We consider the problem of determining the optimal block (or
subsample) size for a spatial subsampling method for spatial
processes observed on regular grids. We derive expansions for the mean
square error of the subsampling variance estimator, which yield an
expression for the theoretically optimal block size. The optimal block
size is shown to depend in an intricate way on the geometry of the
spatial sampling region as well as characteristics of the underlying
random field. Final expressions for the optimal block size make use
of some nontrivial estimates of lattice point counts in shifts of
convex sets. Optimal block sizes are computed for sampling regions of
a number of commonly encountered shapes. Numerical studies are
performed to compare subsampling methods as well as procedures
for estimating the theoretically best block size.

1. Introduction. In this article, the problem of choosing subsample sizes
is examined to maximize the performance of subsampling methods for
variance estimation. The data at hand are viewed as realizations of a
stationary, weakly dependent spatial lattice process. We consider the
common scenario of sampling from sites of regular distance (e.g., indexed
by the integer lattice $\mathbb{Z}^d$), lying within some region $R_n$
embedded in $\mathbb{R}^d$. Such lattice data appear often in time series,
agricultural field trials, and remote sensing and image analysis (medical
and satellite image processing).

Consider estimating the variance of a statistic $\hat\theta_n$ from $R_n$.
For variance estimation via subsampling, the basic idea is to construct
several “scaled-down” copies (subsamples) of the sampling region $R_n$
that fit inside $R_n$, evaluate the analog of $\hat\theta_n$ on each of
these subregions, and then compute a properly normalized sample variance
from the resulting values. The $R_n$-sampling scheme is essentially
recreated at the level of the subregions. Two subsampling designs are most
typical: subregions can be maximally overlapping (OL) or devised to be
nonoverlapping (NOL). The accuracy (e.g., variance and bias) of
subsample-based estimators depends crucially on the choice of subsample
size.

To place our work into perspective, we briefly outline previous research
in variance estimation with subsamples and theoretical size considerations.
Variance estimation through subsampling originated from analysis of weakly
dependent, stationary time processes. Suppose $\hat\theta_n$ is an
estimator of a parameter of interest $\theta$ based on
$\{Z(1),\dots,Z(n)\}$ from a stationary temporal process
$\{Z(i)\}_{i\ge 1}$. To obtain subsamples for $\hat\theta_n$-variance
estimation, Carlstein (1986) first proposed the use of NOL blocks of
length $m < n$: $\{Z(1+(i-1)m),\dots,Z(im)\}$,
$i = 1,\dots,\lfloor n/m\rfloor$, while the sequence of subseries
$\{Z(i),\dots,Z(i+m-1)\}$, $i = 1,\dots,n-m+1$, provides OL subsamples
of length $m$ [cf. Künsch (1989) and Politis and Romano (1993b)]. Here,
$\lfloor x\rfloor$ denotes the integer part of a real number $x$. In each
respective subsample collection, evaluations of an analog statistic
$\hat\theta_i$ are made for each subseries and a normalized sample
variance is calculated to estimate the parameter
$n\,\mathrm{Var}(\hat\theta_n)$,


$$\sum_{i=1}^{J} \frac{m(\hat\theta_i - \tilde\theta)^2}{J}, \qquad \tilde\theta = \sum_{i=1}^{J} \frac{\hat\theta_i}{J},$$


where $J = \lfloor n/m\rfloor$ ($J = n-m+1$) for the NOL (OL)
subsample-based estimator. Carlstein (1986) and Fukuchi (1999) established
the $L_2$ consistency of the NOL and OL estimators, respectively, for the
variance of a general (not necessarily linear) statistic. Politis and
Romano (1993b) determined asymptotic orders of the variance $O(m/n)$ and
bias $O(1/m)$ of the subsample variance estimators for linear statistics.
For mixing time series, they found that a subsample size $m$ proportional
to $n^{1/3}$ is optimal in the sense of minimizing the mean square error
(MSE) of variance estimation, concurring also with optimal block order for
the moving block bootstrap variance estimator [Hall, Horowitz and Jing
(1995) and Lahiri (1996)].
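
The two block designs described above can be sketched in a few lines. The example below is a minimal illustration, not the authors' code: it assumes the statistic is the plain sample mean, takes the estimator to be the average of $m(\hat\theta_i - \tilde\theta)^2$ over the $J$ blocks with $\tilde\theta$ the mean of the block statistics, and uses the hypothetical function name `block_variance_estimate`.

```python
from statistics import fmean


def block_variance_estimate(z, m, overlapping=True):
    """Subsample (block) estimate of n * Var(sample mean) for a series z.

    overlapping=True  -> OL blocks  {z[i], ..., z[i+m-1]},   J = n - m + 1
    overlapping=False -> NOL blocks {z[(i-1)m], ..., z[im-1]}, J = floor(n/m)
    (Illustrative helper; the name and signature are assumptions.)
    """
    n = len(z)
    if not 1 <= m <= n:
        raise ValueError("block length m must satisfy 1 <= m <= n")
    if overlapping:
        starts = range(0, n - m + 1)
    else:
        starts = range(0, (n // m) * m, m)
    # Analog statistic (here: the mean) evaluated on every block.
    block_stats = [fmean(z[s:s + m]) for s in starts]
    center = fmean(block_stats)
    # Normalized sample variance of the block statistics.
    return sum(m * (b - center) ** 2 for b in block_stats) / len(block_stats)


if __name__ == "__main__":
    series = [1.0, 2.0, 3.0, 4.0]
    print(block_variance_estimate(series, 2, overlapping=False))
    print(block_variance_estimate(series, 2, overlapping=True))
```

The only difference between the two designs in this sketch is the set of block starting points; everything else, including the normalization by the block length $m$, is shared.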

Cressie [(1991), page 492] conjectured the recipe for extending Carlstein’s
variance estimator to the general spatial setting, obtaining subsamples by
tiling the sample region $R_n$ with disjoint “congruent” subregions. Politis
and Romano (1993a, 1994) have shown the consistency of subsample-based
variance estimators for rectangular sampling or subsampling regions in
$\mathbb{R}^d$ when the sampling sites are observed on
$\mathbb{Z}^d \cap \prod_{i=1}^{d}[1,n_i]$ and integer translates of
$\prod_{i=1}^{d}[1,m_i]$ yield the subsamples. Garcia-Soidán and Hall
(1997) and Possolo (1991) proposed similar estimators under an identical
sampling scenario. For linear statistics, Politis and Romano (1993a)
determined that a subsampling scaling choice

$$\prod_{i=1}^{d} m_i = C\Big\{\prod_{i=1}^{d} n_i\Big\}^{d/(d+2)},$$

for some unknown $C$, minimizes the order of a variance estimator’s
asymptotic MSE. Sherman and Carlstein (1994) and Sherman (1996) proved the
MSE-consistency of NOL and OL subsample estimators, respectively, for
the variance of general statistics in $\mathbb{R}^2$. Their work allowed
for a more flexible sampling scheme: the “inside” of a simple closed curve
defines a set $D \subset [-1,1]^2$, $\mathbb{Z}^2 \cap nD$ (using a
scaled-up copy of $D$) constitutes the set of sampling sites, and
translates of $mD$ within $nD$ form subsamples. Sherman (1996) minimized a
bound on the asymptotic order of the OL estimator’s MSE to argue that the
best size choice for OL subsamples involves $m = O(n^{1/2})$ [coinciding
with the above findings of Politis and Romano (1993a) for rectangular
regions in $\mathbb{R}^2$]. Politis and Sherman (2001) have developed
consistent subsampling methods for variance estimation with marked point
process data [cf. Politis, Romano and Wolf (1999), Chapter 6].

Few theoretical and numerical recommendations for choosing subsamples
have been offered in the spatial setting, especially with the intent of
variance estimation. As suggested in the literature, an explicit
theoretical determination of optimal subsample size or scaling requires
calculation of an order and associated proportionality constant for a
given sampling region $R_n$. Even for the few sampling situations where
the order of optimal subsample size has been established, the exact
adjustments to these orders are unknown and, quoting Politis and Romano
(1993a), “important (and difficult) in practice.” Beyond the time series
case with the univariate sample mean, the influence of the geometry and
dimension of $R_n$, as well as the structure of $\hat\theta_n$, on precise
subsample selection has not been explored. We attempt here to advance
some ideas on the best size choice, both theoretically and empirically,
for subsamples.

We work under the “smooth function” model of Hall (1992), where the
statistic of interest $\hat\theta_n$ can be represented as a function of
sample means. We formulate a framework for sampling in $\mathbb{R}^d$
where the sampling region $R_n$ is obtained by “inflating” a prototype set
in the unit cube in $\mathbb{R}^d$ and the subsampling regions are given
by suitable translates of a scaled-down copy of the sampling region $R_n$.
We consider both a nonoverlapping version and a (maximal) overlapping
version of the subsampling method. For each method, we derive expansions
for the variance and the bias of the corresponding subsample estimator of
$\mathrm{Var}(\hat\theta_n)$. The asymptotic variance of the spatial
subsample estimator for the OL version turns out to be smaller than that
of the NOL version by a constant factor $K_1$ (say) which depends solely
on the geometry of the sampling region $R_n$. In the time series case,
Meketon and Schmeiser (1984), Künsch (1989), Hall, Horowitz and Jing
(1995) and Lahiri (1996) have shown in different degrees of generality
that the asymptotic variance under the OL subsampling scheme is smaller
than under the NOL one by the factor $K_1 = 2/3$. Results of this paper
show that for rectangular sampling regions $R_n$ in $d$-dimensional space,
the factor $K_1$ is given by $(2/3)^d$. We list the factor $K_1$ for
sampling regions of some common shapes in Table 1.

In contrast, the bias parts of both the OL and NOL subsample variance
estimators are usually asymptotically equivalent and depend on the
covariance structure of the random field as well as on the geometry of the
sampling region $R_n$. Since the bias term is typically of the same order
as the number of lattice points lying near a subsample’s boundary,
determination of the leading bias term involves some nontrivial estimates
of the lattice point counts over translated subregions. Counting lattice
points in scaled-up sets is a hard problem and has received a lot of
attention in analytic number theory and in combinatorics. Even for the
case of the plane (i.e., $d = 2$), the counting results available in the
literature are directly applicable to our problem only for a very
restricted class of subregions that have the so-called “smoothly winding
border” [cf. van der Corput (1920) and Huxley (1993, 1996)]. Here
explicit expressions for the bias terms are derived for a more general
class of sampling regions using some new estimates on the discrepancy
between the number of lattice points and the volume of the shifted
subregions in the plane and in three-dimensional Euclidean space. In
particular, our results are applicable to sampling regions that do not
necessarily have “smoothly winding borders.”

Minimizing the combined expansions for the bias and the variance parts,
we derive explicit expressions for the theoretical optimal block size for
sampling regions of different shapes. To briefly describe the result for
a few common shapes: Suppose the sampling region $R_n$ is obtained by
inflating a given set $R_0 \subset \mathbb{R}^d$ by a scaling constant
$\lambda_n$ as $R_n = \lambda_n R_0$ and that the subsamples are formed by
considering the translates of ${}_sR_n = {}_s\lambda_n R_0$. Then the
theoretically optimal choice of the subsample size ${}_s\lambda_n$ for the
OL version is of the form


$${}_s\lambda_n^{\mathrm{opt}} = \left(\frac{\lambda_n^{d}\,B_0^{2}}{d\,K_0\,\tau^{4}}\right)^{1/(d+2)} (1 + o(1)) \quad \text{as } n \to \infty$$


Table 1

Examples of $K_1$ for several shapes of the sampling region $R_n \subset \mathbb{R}^d$

Shape of $R_n$                      $K_1$
Rectangle in $\mathbb{R}^d$         $(2/3)^d$
Sphere in $\mathbb{R}^3$            $17\pi/315$
Circle in $\mathbb{R}^2$            $\pi/4 - 4/(3\pi)$
Right triangle in $\mathbb{R}^2$    $1/5$

for some constants $B_0$ and $K_0$ (coming from the bias and the variance
terms, respectively) where $\tau^2$ is a population parameter that does
not depend on the shape of the sampling region $R_n$ (see Theorem 5.1 for
details). Table 2 lists the constants $B_0$ and $K_0$ for some shapes of
$R_n$. It follows from Table 2 that, unlike the time series case, in
higher dimensions the optimal block size critically depends on the shape
of the spatial sampling region $R_n$. It simplifies only slightly for the
NOL subsampling scheme as the constant $K_0$ is unnecessary for computing
optimal NOL subsamples, but the bias constant $B_0$ is often the same for
estimators from each version of subsampling. These expressions may be
readily used to obtain estimates of the theoretical optimal subsample
scaling for use in practice.
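
To make the role of the two constants concrete, the following sketch evaluates a plug-in optimal scaling. It rests on an assumption stated here rather than taken verbatim from the paper: the leading MSE is modeled as a squared bias $B_0^2/{}_s\lambda_n^2$ plus a variance $2K_0\tau^4({}_s\lambda_n/\lambda_n)^d$, whose minimizer is $(\lambda_n^d B_0^2/(dK_0\tau^4))^{1/(d+2)}$. The function names and the numerical inputs are hypothetical; in practice $B_0$ and $\tau^2$ would have to be estimated.

```python
def asymptotic_mse(s_lam, lam_n, d, B0, K0, tau2):
    """Assumed leading-order MSE: squared bias plus variance."""
    bias_sq = (B0 / s_lam) ** 2
    variance = 2.0 * K0 * tau2 ** 2 * (s_lam / lam_n) ** d
    return bias_sq + variance


def optimal_ol_scaling(lam_n, d, B0, K0, tau2):
    """Minimizer of asymptotic_mse over s_lam > 0 (closed form)."""
    if lam_n <= 0 or K0 <= 0 or tau2 <= 0 or d < 1:
        raise ValueError("lam_n, K0, tau2 must be positive and d >= 1")
    return (lam_n ** d * B0 ** 2 / (d * K0 * tau2 ** 2)) ** (1.0 / (d + 2))


if __name__ == "__main__":
    # Time series (d = 1, K0 = 2/3) with made-up B0 = 1, tau^2 = 1, n = 1000.
    print(optimal_ol_scaling(1000.0, 1, 1.0, 2.0 / 3.0, 1.0))
    # Spherical region in three dimensions (K0 = 34/105), made-up B0 and tau^2.
    print(optimal_ol_scaling(100.0, 3, 1.5, 34.0 / 105.0, 1.0))
```

Under this model the optimal scaling grows like $\lambda_n^{d/(d+2)}$, which reduces to the familiar $n^{1/3}$ rate when $d = 1$.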

The rest of the paper is organized as follows. In Section 2 we describe the 
spatial subsampling method and state the assumptions used in the paper. 
In Sections 3 and 4 we, respectively, derive expansions for the variance and 
the bias parts of the subsampling estimators. Theoretical optimal subsample 
scalings (or block sizes) are derived in Section 5. The results are illustrated 
with some common examples in Section 6. Section 7 describes two methods 
for estimating optimal subsample scaling. In Section 8 a numerical study of 
subsample variance estimators and scaling estimation methods is provided. 
Proofs of variance and bias results are separated into Sections 9 and 10, 
respectively. 

2. Variance estimators via subsampling. In Section 2.1 we frame the
sampling design and the structure of the sampling region. Two methods of
subsampling are presented in Section 2.2 along with corresponding
nonparametric variance estimators. Assumptions and conditions used in the
paper are given in Section 2.3.

2.1. The sampling structure. To describe the sampling scheme used, we
first assume all potential sampling sites are located on a translate of the 


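Table 2

Examples of $B_0$, $K_0$ for some sampling regions $R_n$*

$R_n$: Sphere in $\mathbb{R}^3$
  $K_0 = 34/105$
  $B_0 = \frac{3}{2}\sum_{\mathbf{k}\in\mathbb{Z}^3} \|\mathbf{k}\|\,\sigma(\mathbf{k})$

$R_n$: Cross in $\mathbb{R}^2$
  $K_0 = 4/9 \cdot 191/192$
  $B_0 = \frac{4}{3}\sum_{\mathbf{k}\in\mathbb{Z}^2} \|\mathbf{k}\|_1\,\sigma(\mathbf{k})$

$R_n$: Right triangle in $\mathbb{R}^2$
  $K_0 = 2/5$
  $B_0 = 2\Big[\sum_{\mathbf{k}=(k_1,k_2)',\ \mathrm{sign}\,k_1 = \mathrm{sign}\,k_2} \|\mathbf{k}\|_1\,\sigma(\mathbf{k}) + \sum_{\mathbf{k}=(k_1,k_2)',\ \mathrm{sign}\,k_1 \ne \mathrm{sign}\,k_2} \|\mathbf{k}\|_\infty\,\sigma(\mathbf{k})\Big]$

*Cross and triangle shapes appear in Figure 1; see Section 6 for further
details. Autocovariances $\sigma(\cdot)$ and Euclidean, $l^1$ and
$l^\infty$ norms $\|\cdot\|$, $\|\cdot\|_1$, $\|\cdot\|_\infty$ are
described in Section 2.3.
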
rectangular integer lattice in $\mathbb{R}^d$. For a fixed (chosen) vector
$\mathbf{t} \in [-1/2,1/2)^d$, we identify the $\mathbf{t}$-translated
integer lattice as $\mathcal{Z}^d = \mathbf{t} + \mathbb{Z}^d$. Let
$\{Z(\mathbf{s}) : \mathbf{s} \in \mathcal{Z}^d\}$ be a stationary weakly
dependent random field (hereafter r.f.) taking values in $\mathbb{R}^p$.
[We use bold font as a standard to denote vectors in the space of sampling
$\mathbb{R}^d$ and normal font for vectors in $\mathbb{R}^p$, including
$Z(\cdot)$.] We suppose that the process $Z(\cdot)$ is observed at
sampling sites lying within the sampling region
$R_n \subset \mathbb{R}^d$. That is, the collection of available
observations is $\{Z(\mathbf{s}) : \mathbf{s} \in R_n \cap \mathcal{Z}^d\}$.

To obtain the results in the paper, we assume that the sampling region 
Rn becomes unbounded as the sample size increases. This will provide a 
commonly used “increasing domain” framework for studying asymptotics 
with spatial lattice data [cf. Cressie (1991)]. We next specify the structure 
of the regions Rn and employ a formulation similar to that of Lahiri (1999a, 
2004). 

Let $R_0$ be a Borel subset of $(-1/2,1/2]^d$ containing an open
neighborhood of the origin such that for any sequence of positive real
numbers $a_n \to 0$, the number of cubes of the scaled lattice
$a_n\mathbb{Z}^d$ which intersect the closures $\overline{R_0}$ and
$\overline{R_0^c}$ is $O((a_n^{-1})^{d-1})$ as $n \to \infty$. Let
$\Delta_n$ be a sequence of $d \times d$ diagonal matrices, with positive
diagonal elements $\lambda_1^{(n)},\dots,\lambda_d^{(n)}$, such that each
$\lambda_i^{(n)} \to \infty$ as $n \to \infty$. We assume that the
sampling region $R_n$ is obtained by “inflating” the template set $R_0$ by
the directional scaling factors $\Delta_n$; namely,

$$R_n = \Delta_n R_0.$$

Because the origin is assumed to lie in $R_0$, the sampling region $R_n$
grows outward in all directions as $n$ increases. Furthermore, if the
scaling factors are all equal
($\lambda_1^{(n)} = \cdots = \lambda_d^{(n)}$), the shape of $R_n$ remains
the same for different values of $n$.
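
The construction $R_n = \Delta_n R_0$ can be illustrated by enumerating the lattice sites that fall in an inflated template. The sketch below assumes the template $R_0$ is supplied as an indicator function on $(-1/2,1/2]^d$; the names `sampling_sites`, `unit_cube` and `disk` are hypothetical and are not part of the paper.

```python
import itertools
import math


def unit_cube(x):
    """Indicator of the template R_0 = (-1/2, 1/2]^d."""
    return all(-0.5 < u <= 0.5 for u in x)


def disk(x):
    """Indicator of a closed ball of diameter 1 centred at the origin."""
    return sum(u * u for u in x) <= 0.25


def sampling_sites(in_template, scales, t=None):
    """Sites s on the translated lattice t + Z^d with Delta_n^{-1} s in R_0.

    scales holds the diagonal of Delta_n; t is the lattice translate in
    [-1/2, 1/2)^d (defaults to the untranslated lattice).
    """
    d = len(scales)
    if t is None:
        t = (0.0,) * d
    axes = []
    for lam, ti in zip(scales, t):
        # R_0 lies in (-1/2, 1/2]^d, so |s_i| <= lam/2 for every site.
        half = math.ceil(lam / 2) + 1
        axes.append([j + ti for j in range(-half, half + 1)])
    return [
        s for s in itertools.product(*axes)
        if in_template(tuple(si / lam for si, lam in zip(s, scales)))
    ]


if __name__ == "__main__":
    print(len(sampling_sites(unit_cube, (4, 6))))  # rectangular region
    print(len(sampling_sites(disk, (4, 4))))       # circular region
```

Unequal entries in `scales` stretch the template differently along each axis, which is how the formulation accommodates rectangles and ellipses from a cube and a ball.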

The formulation given above allows the sampling region $R_n$ to have a
large variety of fairly irregular shapes with the boundary condition on
$R_0$ imposed to avoid pathological cases. Some common examples of such
regions are convex subsets of $\mathbb{R}^d$, such as spheres, ellipsoids,
polyhedrons, as well as certain nonconvex subsets with irregular
boundaries, such as star-shaped regions. Sherman and Carlstein (1994) and
Sherman (1996) consider a similar class of such regions in the plane
(i.e., $d = 2$) where the boundaries of the sets $R_0$ are delineated by
simple rectifiable curves with finite lengths. The border requirements on
$R_0$ ensure that the number of observations near the boundary of $R_n$ is
negligible compared to the totality of data values.

2.2. Subsampling designs and variance estimators. We suppose that the
relevant statistic, whose variance we wish to estimate, can be represented
as a function of sample means. Let $\hat\theta_n = H(\bar Z_{N_n})$ be an
estimator of the population parameter of interest $\theta = H(\mu)$, where
$H : \mathbb{R}^p \to \mathbb{R}$ is a smooth function,
$EZ(\mathbf{t}) = \mu \in \mathbb{R}^p$ is the mean of the stationary
r.f., and $\bar Z_{N_n}$ is the sample mean of the $N_n$ observations
within $R_n$,

(2.1) $\bar Z_{N_n} = N_n^{-1} \sum_{\mathbf{s} \in \mathcal{Z}^d \cap R_n} Z(\mathbf{s})$.

This parameter and estimator formulation is what Hall (1992) calls the
“smooth function” model and it has been used in other scenarios, such as
with the moving block bootstrap (MBB), for studying approximately linear
functions of a sample mean [cf. Lahiri (1996) and Politis, Romano and Wolf
(1999)]. By considering suitable functions of the $Z(\mathbf{s})$’s, one
can represent a wide range of estimators under the present framework. In
particular, these include means, products and ratios of means, sample
moments, spatial correlograms, Yule-Walker estimates for autoregressive
processes [cf. Guyon (1995)] and some pseudo likelihood-based estimators
of process parameters [cf. Ripley (1981)].

The quantity which we seek to estimate nonparametrically is the variance
of the normalized statistic $\sqrt{N_n}\,\hat\theta_n$, say,
$\tau_n^2 = N_n E(\hat\theta_n - E\hat\theta_n)^2$. In our problem, this
goal is equivalent to consistently estimating the limiting variance
$\tau^2 = \lim_{n\to\infty} \tau_n^2$.

2.2.1. Overlapping subsamples. Variance estimation with OL subsampling
regions has often been considered in the literature, though in more
narrow sampling situations [cf. $\mathbb{R}^2$-sampling regions, Sherman
(1996); $\mathbb{R}^d$-rectangular regions, Politis and Romano (1994);
time series data, Politis and Romano (1993a)].

We first consider creating a smaller version of $R_n$, which will serve as
a template for the OL subsampling regions. To this end, let
${}_s\Delta_n$ be a $d \times d$ diagonal matrix with positive diagonal
elements, $\{{}_s\lambda_1^{(n)},\dots,{}_s\lambda_d^{(n)}\}$, such that
${}_s\lambda_i^{(n)}/\lambda_i^{(n)} \to 0$ and
${}_s\lambda_i^{(n)} \to \infty$, as $n \to \infty$, for each
$i = 1,\dots,d$. (The matrix $\Delta_n$ represents the determining scaling
factors for $R_n$ and ${}_s\Delta_n$ shall be factors used to define the
subsamples.) We make the “prototype” subsampling region

(2.2) ${}_sR_n = {}_s\Delta_n R_0$,

and identify a subset of $\mathbb{Z}^d$, say $J_{OL}$, corresponding to
all integer translates of ${}_sR_n$ lying within $R_n$. That is,

$$J_{OL} = \{\mathbf{i} \in \mathbb{Z}^d : \mathbf{i} + {}_sR_n \subset R_n\}.$$

The desired OL subsampling regions are precisely the translates of
${}_sR_n$ given by $R_{\mathbf{i},n} = \mathbf{i} + {}_sR_n$,
$\mathbf{i} \in J_{OL}$. Note that the origin belongs to $J_{OL}$ and some
of these subregions may clearly overlap.
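Let ${}_sN_n = |\mathcal{Z}^d \cap {}_sR_n|$ be the number of sampling
sites in ${}_sR_n$ and let $|J_{OL}|$ denote the number of available
subsampling regions. The number of sampling sites within each OL
subsampling region is the same; namely, for any $\mathbf{i} \in J_{OL}$,
${}_sN_n = |\mathcal{Z}^d \cap R_{\mathbf{i},n}|$. For each
$\mathbf{i} \in J_{OL}$, compute
$\hat\theta_{\mathbf{i},n} = H(Z_{\mathbf{i},n})$, where

$$Z_{\mathbf{i},n} = {}_sN_n^{-1} \sum_{\mathbf{s} \in \mathcal{Z}^d \cap R_{\mathbf{i},n}} Z(\mathbf{s})$$

denotes the sample mean of observations within the subregion. We then have
the OL subsample variance estimator of $\tau_n^2$ as

$$\hat\tau^2_{n,OL} = |J_{OL}|^{-1} \sum_{\mathbf{i} \in J_{OL}} {}_sN_n (\hat\theta_{\mathbf{i},n} - \tilde\theta_n)^2, \qquad \tilde\theta_n = |J_{OL}|^{-1} \sum_{\mathbf{i} \in J_{OL}} \hat\theta_{\mathbf{i},n}.$$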
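
A minimal sketch of the OL estimator follows. It is an illustration under simplifying assumptions, not the authors' implementation: sites are integer tuples (untranslated lattice), the data are stored in a dictionary keyed by site, the prototype subregion is given by its list of lattice sites, and the inclusion $\mathbf{i} + {}_sR_n \subset R_n$ is checked only on lattice sites. The function name `ol_subsample_variance` is hypothetical.

```python
from statistics import fmean


def ol_subsample_variance(field, block, H=lambda zbar: zbar):
    """Overlapping subsample estimate of N_n * Var(H(sample mean)).

    field : dict mapping integer site tuples in R_n to real observations
    block : list of integer site tuples in the prototype subregion
    H     : smooth function of the subsample mean (identity by default)
    """
    sites = set(field)
    anchor = block[0]
    # Any admissible translate i must send the anchor site into R_n.
    candidates = {
        tuple(sj - bj for sj, bj in zip(s, anchor)) for s in sites
    }
    thetas = []
    for i in candidates:
        shifted = [tuple(ij + bj for ij, bj in zip(i, b)) for b in block]
        # Lattice-level stand-in for the inclusion i + sR_n in R_n.
        if all(s in sites for s in shifted):
            thetas.append(H(fmean(field[s] for s in shifted)))
    center = fmean(thetas)
    # Scale by the common number of sites per subsample.
    return len(block) * fmean((th - center) ** 2 for th in thetas)


if __name__ == "__main__":
    # 3 x 3 grid with a linear trend in the first coordinate, 2 x 2 blocks.
    grid = {(x, y): float(x) for x in range(3) for y in range(3)}
    proto = [(0, 0), (0, 1), (1, 0), (1, 1)]
    print(ol_subsample_variance(grid, proto))
```

Because every admissible translate contains the same number of sites, the scaling factor can be taken outside the average, as in the display above.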

2.2.2. Nonoverlapping subsamples. To create NOL subsamples, we adopt
a formulation similar to that of Sherman and Carlstein (1994) and Lahiri
(1999a). The sampling region $R_n$ is first divided into disjoint “cubes.”
Let ${}_s\Delta_n$ be the previously described $d \times d$ diagonal
matrix from (2.2), which will determine the “window width” of the
partitioning cubes. Let

$$J_{NOL} = \{\mathbf{i} \in \mathbb{Z}^d : {}_s\Delta_n(\mathbf{i} + (-1/2,1/2]^d) \subset R_n\}$$

represent the set of all “inflated” subcubes that lie inside $R_n$. Denote
its cardinality as $|J_{NOL}|$. For each $\mathbf{i} \in J_{NOL}$, define
the subsampling region
$R_{\mathbf{i},n} = {}_s\Delta_n(\mathbf{i} + R_0)$ by inscribing the
translate of ${}_s\Delta_n R_0$ such that the origin is mapped onto the
midpoint of the cube ${}_s\Delta_n(\mathbf{i} + (-1/2,1/2]^d)$. This
provides a collection of NOL subsampling regions, which are smaller
versions of the original sampling region $R_n$ that lie inside $R_n$.

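For each $\mathbf{i} \in J_{NOL}$, the function $H(\cdot)$ is evaluated at
the sample mean, say $Z_{\mathbf{i},n}$, for a corresponding subsampling
region $R_{\mathbf{i},n}$ to obtain
$\hat\theta^{NOL}_{\mathbf{i},n} = H(Z_{\mathbf{i},n})$. The NOL subsample
estimator of $\tau_n^2$ is again an appropriately scaled sample variance,

$$\hat\tau^2_{n,NOL} = |J_{NOL}|^{-1} \sum_{\mathbf{i} \in J_{NOL}} {}_sN_{\mathbf{i},n} (\hat\theta^{NOL}_{\mathbf{i},n} - \tilde\theta^{NOL}_n)^2, \qquad \tilde\theta^{NOL}_n = |J_{NOL}|^{-1} \sum_{\mathbf{i} \in J_{NOL}} \hat\theta^{NOL}_{\mathbf{i},n},$$

where ${}_sN_{\mathbf{i},n} = |\mathcal{Z}^d \cap R_{\mathbf{i},n}|$
denotes the number of sampling sites within a given NOL subsample.

We note that ${}_sN_{\mathbf{i},n}$ may differ between NOL subsamples, but
all such subsamples will have exactly ${}_sN_{\mathbf{i},n} = {}_sN_n$
sites available if the diagonal elements of ${}_s\Delta_n$ are integers.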
2.3. Assumptions. For stating the assumptions, we need to introduce
some notation. For a vector
$\mathbf{x} = (x_1,\dots,x_d)' \in \mathbb{R}^d$, let $\|\mathbf{x}\|$ and
$\|\mathbf{x}\|_1 = \sum_{i=1}^{d} |x_i|$ denote the usual Euclidean and
$l^1$ norms of $\mathbf{x}$, respectively. Denote the $l^\infty$ norm as
$\|\mathbf{x}\|_\infty = \max_{1 \le k \le d} |x_k|$. Define
$\mathrm{dis}(E_1,E_2) = \inf\{\|\mathbf{x} - \mathbf{y}\|_\infty : \mathbf{x} \in E_1, \mathbf{y} \in E_2\}$
for two sets $E_1, E_2 \subset \mathbb{R}^d$. We shall use the notation
$|\cdot|$ also in two other cases: for a countable set $B$, $|B|$ will
denote the cardinality of the set $B$; for an uncountable set
$A \subset \mathbb{R}^d$, $|A|$ will refer to the volume (i.e., the
Lebesgue measure) of $A$.

Let $\mathcal{F}_Z(T) = \sigma\langle Z(\mathbf{s}) : \mathbf{s} \in T\rangle$
be the $\sigma$-field generated by the variables
$\{Z(\mathbf{s}) : \mathbf{s} \in T\}$, $T \subset \mathcal{Z}^d$. For
$T_1, T_2 \subset \mathcal{Z}^d$, write
$\tilde\alpha(T_1,T_2) = \sup\{|P(A \cap B) - P(A)P(B)| : A \in \mathcal{F}_Z(T_1), B \in \mathcal{F}_Z(T_2)\}$.
Then the strong mixing coefficient for the r.f. $Z(\cdot)$ is defined as

(2.3) $\alpha(k,l) = \sup\{\tilde\alpha(T_1,T_2) : T_i \subset \mathcal{Z}^d, |T_i| \le l, i = 1,2;\ \mathrm{dis}(T_1,T_2) \ge k\}$.

Note that the supremum in the definition of $\alpha(k,l)$ is taken over
sets $T_1, T_2$ which are bounded. For $d > 1$ this is important. An r.f.
on the lattice $\mathbb{Z}^d$ with $d \ge 2$ that satisfies a strong
mixing condition of the form

(2.4) $\lim_{k \to \infty} \sup\{\tilde\alpha(T_1,T_2) : T_1, T_2 \subset \mathbb{Z}^d,\ \mathrm{dis}(T_1,T_2) \ge k\} = 0$

with supremum taken over possibly unbounded sets necessarily belongs to
the more restricted class of $\rho$-mixing r.f.’s [cf. Bradley (1989)].
Politis and Romano (1993a) use moment inequalities based on the mixing
condition in (2.4) to determine the orders of the bias and variance of
$\hat\tau^2_{n,OL}$, $\hat\tau^2_{n,NOL}$ for rectangular sampling
regions.

For proving the subsequent theorems, Assumptions A.1-A.5 are needed
along with two conditions stated as functions of a positive argument
$r \in \mathbb{Z}_+ = \{0,1,2,\dots\}$. In the following, $\det(\Delta)$
represents the determinant of a square matrix $\Delta$. For
$\alpha = (\alpha_1,\dots,\alpha_p)' \in (\mathbb{Z}_+)^p$, let
$D^\alpha$ denote the $\alpha$th order partial differential operator
$\partial^{\alpha_1+\cdots+\alpha_p}/\partial x_1^{\alpha_1}\cdots\partial x_p^{\alpha_p}$
and
$\nabla = (\partial H(\mu)/\partial x_1,\dots,\partial H(\mu)/\partial x_p)'$
be the vector of first-order partial derivatives of $H$ at $\mu$. Limits
in order symbols are taken letting $n$ tend to infinity.


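Assumptions.

A.1. There exists a $d \times d$ diagonal matrix $\Delta_0$,
$\det(\Delta_0) > 0$, such that

$$\frac{1}{{}_s\lambda_1^{(n)}}\,{}_s\Delta_n \to \Delta_0.$$

A.2. For the scaling factors of the sampling and subsampling regions,

$$\sum_{i=1}^{d} \frac{1}{{}_s\lambda_i^{(n)}} + \sum_{i=1}^{d} \frac{{}_s\lambda_i^{(n)}}{\lambda_i^{(n)}} + \frac{[\det({}_s\Delta_n)]^{(d+1)/d}}{\det(\Delta_n)} = o(1), \qquad \max_{1 \le i \le d} \lambda_i^{(n)} = O\Big(\min_{1 \le i \le d} \lambda_i^{(n)}\Big).$$
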
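A.3. There exist nonnegative functions $\alpha_1(\cdot)$ and $g(\cdot)$
such that $\lim_{k \to \infty} \alpha_1(k) = 0$,
$\lim_{l \to \infty} g(l) = \infty$ and the strong-mixing coefficient
$\alpha(k,l)$ from (2.3) satisfies the inequality

$$\alpha(k,l) \le \alpha_1(k)\,g(l), \qquad k > 0,\ l > 0.$$

A.4. $\sup\{\tilde\alpha(T_1,T_2) : T_1, T_2 \subset \mathcal{Z}^d,\ |T_1| = 1,\ \mathrm{dis}(T_1,T_2) \ge k\} = o(k^{-d})$.

A.5. $\tau^2 = \sum_{\mathbf{k} \in \mathbb{Z}^d} \sigma(\mathbf{k}) > 0$, where $\sigma(\mathbf{k}) = \mathrm{Cov}(\nabla' Z(\mathbf{t}), \nabla' Z(\mathbf{t} + \mathbf{k}))$.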

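Conditions.

$D_r$. $H : \mathbb{R}^p \to \mathbb{R}$ is $r$-times continuously
differentiable and, for some $a \in \mathbb{Z}_+$ and real $C > 0$,

$$\max\{|D^\nu H(x)| : \|\nu\|_1 = r\} \le C(1 + \|x\|^a), \qquad x \in \mathbb{R}^p.$$

$M_r$. For some $0 < \delta < 1$,
$0 < \kappa < (2r - 1 - 1/d)(2r + \delta)/\delta$, and $C > 0$,

$$E\|Z(\mathbf{t})\|^{2r+\delta} < \infty, \qquad \sum_{m=1}^{\infty} m^{(2r-1)d-1}\,\alpha_1(m)^{\delta/(2r+\delta)} < \infty, \qquad g(x) \le C x^{\kappa},\ x \in [1,\infty).$$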

Some comments about the assumptions and the conditions are in order.
Assumption A.5 implies a positive, finite asymptotic variance for the
standardized estimator $\sqrt{N_n}\,\hat\theta_n$.

In Assumption A.3 we formulate a conventional bound on the mixing
coefficient $\alpha(k,l)$ from (2.3) that is applicable to many r.f.’s and
resembles the mixing assumption of Lahiri (1999a, 2004). For r.f.’s
satisfying Assumption A.3, the “distance” component of the bound,
$\alpha_1(\cdot)$, often decreases at an exponential rate while the
function of “set size,” $g(\cdot)$, increases at a polynomial rate [cf.
Guyon (1995)]. Examples of r.f.’s that meet the requirements of Assumption
A.3 and Condition $M_r$ include Gaussian fields with analytic spectral
densities, certain linear fields with a moving average or autoregressive
(AR) representation (like $m$-dependent fields), separable
AR(1) $\times$ AR(1) lattice processes suggested by Martin (1990) for
modeling in $\mathbb{R}^2$, many Gibbs and Markov fields, and important
time series models [cf. Doukhan (1995)]. Condition $M_r$ combined with
Assumption A.3 also provides useful moment bounds for normed sums of
observations (see Lemma 9.2).

Assumption A.4 permits the CLT in Bolthausen (1982) to be applied to
sums of $Z(\cdot)$ on sets of increasing domain, in conjunction with the
boundary condition on $R_0$, Assumption A.3 and Condition $M_r$. This
version of the CLT (Stein’s method) is derived from $\alpha$-mixing
conditions which ensure asymptotic independence between a single point and
observations in arbitrary sets of increasing distance [cf. Perera (1997)].
Assumptions A.1 and A.2 set additional guidelines for how sampling and
subsampling design parameters, $\Delta_n$ and ${}_s\Delta_n$, may be
chosen. The assumptions provide a flexible framework for handling
“increasing domains” of many shapes. For $d = 1$, Assumptions A.1 and A.2
are equivalent to the requirements of Lahiri (1999b) who provides variance
and bias expansions for the MBB variance estimator with weakly dependent
time processes.


3. Variance expansions. We now give expansions for the asymptotic
variance of the OL/NOL subsample variance estimators $\hat\tau^2_{n,OL}$
and $\hat\tau^2_{n,NOL}$ of $\tau_n^2 = N_n \mathrm{Var}(\hat\theta_n)$.


Theorem 3.1. Suppose that Assumptions A.1-A.5 and Conditions $D_2$ and
$M_{5+2a}$ hold with $a$ as specified under Condition $D_2$. Then,

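(a) $\mathrm{Var}(\hat\tau^2_{n,OL}) = K_0 \cdot \dfrac{\det({}_s\Delta_n)}{\det(\Delta_n)} \cdot [2\tau^4](1 + o(1))$,

(b) $\mathrm{Var}(\hat\tau^2_{n,NOL}) = \dfrac{1}{|R_0|} \cdot \dfrac{\det({}_s\Delta_n)}{\det(\Delta_n)} \cdot [2\tau^4](1 + o(1))$,

where

$$K_0 = \frac{1}{|\Delta_0 R_0|} \int_{\mathbb{R}^d} \frac{|(\mathbf{x} + \Delta_0 R_0) \cap \Delta_0 R_0|^2}{|\Delta_0 R_0|^2}\, d\mathbf{x}$$

is an integral with respect to the Lebesgue measure.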


The constant $K_0$ appearing in the variance expansion of the estimator
$\hat\tau^2_{n,OL}$ is a property of the shape of the sampling template
$R_0$ but not of its exact embedding in space or even the scale of the
set. Namely, $K_0$ is invariant to invertible affine transformations
applied to $R_0$ and hence can be computed from either $R_0$ or
$\Delta_0 R_0$. Values of $K_0$ for some template shapes are given in
Table 3 and Section 6.
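
The shape constant can be checked numerically for a simple template. The sketch below reads $K_0$ in the scale-free form $\int |(\mathbf{x} + R) \cap R|^2\, d\mathbf{x} / |R|^3$, which is an assumption suggested by the invariance remark above, and evaluates it for a disk using the standard lens-area formula for two overlapping circles and a midpoint rule. The function names are illustrative only.

```python
import math


def overlap_area_unit_disks(t):
    """Area of the intersection of two unit disks whose centres are t apart."""
    if t >= 2.0:
        return 0.0
    return 2.0 * math.acos(t / 2.0) - 0.5 * t * math.sqrt(4.0 - t * t)


def k0_disk(steps=20000):
    """K_0 for a disk: integrate overlap^2 over all shifts, divide by area^3."""
    h = 2.0 / steps
    total = 0.0
    for j in range(steps):
        t = (j + 0.5) * h  # midpoint of the j-th radial cell
        # Shifts at distance t form a ring of length 2*pi*t.
        total += 2.0 * math.pi * t * overlap_area_unit_disks(t) ** 2 * h
    return total / math.pi ** 3


def k0_interval(steps=20000):
    """K_0 for an interval of length 1: integral of (1 - |x|)^2 over [-1, 1]."""
    h = 1.0 / steps
    return 2.0 * sum((1.0 - (j + 0.5) * h) ** 2 * h for j in range(steps))


if __name__ == "__main__":
    print(k0_disk(), 1.0 - 16.0 / (3.0 * math.pi ** 2))
    print(k0_interval(), 2.0 / 3.0)
```

For a rectangle in $d$ dimensions the overlap factorizes across coordinates, so the one-dimensional value simply enters as a $d$-fold product.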

A stationary time sequence $Z(1),\dots,Z(n)$ can be obtained within our
sampling formulation by choosing $R_0 = (-1/2,1/2]$ and
$\lambda_1^{(n)} = n$ on the untranslated integer lattice
$\mathcal{Z} = \mathbb{Z}$. In this special sampling case, an application

Table 3

Examples of $K_0$ from Theorem 3.1 for several shapes of $R_0 \subset \mathbb{R}^d$

$R_0$ shape                     $K_0$
$\mathbb{R}^d$ Rectangle        $(2/3)^d$
$\mathbb{R}^3$ Ellipsoid        $34/105$
$\mathbb{R}^3$ Cylinder         $2/3\,(1 - 16/(3\pi^2))$
$\mathbb{R}^2$ Ellipse          $1 - 16/(3\pi^2)$
$\mathbb{R}^2$ Trapezoid*       $2/5\,(1 + 4c/9)$

*The trapezoid has a 90° interior angle and parallel sides $b_2 \ge b_1$;
$c = (b_2/b_1 + 1)^{-2}[1 + 2(b_2/b_1 - 1)/(b_2/b_1 + 1)]$.

of Theorem 3.1 yields

$$\mathrm{Var}(\hat\tau^2_{n,OL}) = 2/3 \cdot \mathrm{Var}(\hat\tau^2_{n,NOL}),$$
$$\mathrm{Var}(\hat\tau^2_{n,NOL}) = \frac{{}_s\lambda_1^{(n)}}{n} \cdot [2\tau^4](1 + o(1)),$$

a result which is well known for “nearly” linear functions $\hat\theta_n$
of a time series sample mean [cf. Künsch (1989)]. Theorem 3.1 implies
that, under the “smooth” function model, the asymptotic variance of the
OL subsample-based variance estimator is always strictly less than that of
the NOL version because


(3.1) $K_1 = \lim_{n \to \infty} \dfrac{\mathrm{Var}(\hat\tau^2_{n,OL})}{\mathrm{Var}(\hat\tau^2_{n,NOL})} = K_0 \cdot |R_0| < 1.$


If both estimators have the same bias (which is often the case), (3.1) implies 
that variance estimation with OL subsamples is asymptotically more efficient 
than the NOL subsample alternative owing to a smaller asymptotic MSE. 

Unlike $K_0$, $K_1$ does depend on the volume $|R_0|$, which in turn is
constrained by the $R_0$-template’s geometry. Through $|R_0|$ in (3.1),
$K_1$ is ultimately bounded by the amount of space that an object of
$R_0$’s shape can possibly occupy within $(-1/2,1/2]^d$ [i.e., by how much
volume can be filled by a given geometrical body (e.g., circle) compared
to a cube]. The constants $K_1$ in Table 1 are computed with templates of
prescribed shape and largest possible volume in $(-1/2,1/2]^d$. These
values most accurately reflect the influence of $R_0$’s (or $R_n$’s)
geometry on the large-sample relative performance of $\hat\tau^2_{n,OL}$
and $\hat\tau^2_{n,NOL}$ in terms of variance in (3.1) and also efficiency
(see Section 5).

To conclude this section, we remark that both subsample-based variance
estimators can be shown to be MSE-consistent under Theorem 3.1
conditions, allowing for more general spatial sampling regions, in both
shape and dimension, than previously considered. Inference on the
parameter $\theta$ can be made through the limiting standard normal
distribution of $\sqrt{N_n}(\hat\theta_n - \theta)/\hat\tau_n$ for
$\hat\tau_n = \hat\tau_{n,OL}$ or $\hat\tau_{n,NOL}$.


4. Bias expansions. We now try to capture and precisely describe the
leading order terms in the asymptotic bias of each subsample-based
variance estimator, similar to the variance determinations from the
previous section. We first establish and note the order of the dominant
component in the bias expansions of $\hat\tau^2_{n,OL}$ and
$\hat\tau^2_{n,NOL}$, which is the subject of the following lemma.


Lemma 4.1. With Assumptions A.1-A.5, suppose that Conditions $D_2$ and
$M_{2+a}$ hold for $d \ge 2$ or that $D_3$ and $M_{3+a}$ hold for $d = 1$
(where $a$ is as specified by the respective Condition $D_r$). Then the
subsample estimators of $\tau_n^2 = N_n \mathrm{Var}(\hat\theta_n)$ have
expectations

$$E(\hat\tau^2_{n,OL}) = \tau_n^2 + O(1/{}_s\lambda_1^{(n)}) \quad \text{and} \quad E(\hat\tau^2_{n,NOL}) = \tau_n^2 + O(1/{}_s\lambda_1^{(n)}).$$

The lemma shows that, under the smooth function model, the asymptotic
bias of each estimator is $O(1/{}_s\lambda_1^{(n)})$ for all dimensions of
sampling. Politis and Romano (1993a) and Sherman (1996) showed this same
size for the bias of $\hat\tau^2_{n,OL}$ with sampling regions based on
rectangles $R_0 = (-1/2,1/2]^d$ or simple closed curves in $\mathbb{R}^2$,
respectively. Lemma 4.1 extends these results to a broader class of
sampling regions. However, we would like to precisely identify the
$O(1/{}_s\lambda_1^{(n)})$ bias component for $\hat\tau^2_{n,OL}$ or
$\hat\tau^2_{n,NOL}$ to obtain optimal subsample scaling that accounts for
the geometry of $R_n$.

To achieve some measure of success in determining the exact bias of the
subsampling estimators, we reformulate the subsampling design slightly so
that ${}_s\lambda_n \equiv {}_s\lambda_1^{(n)} = \cdots = {}_s\lambda_d^{(n)}$.
That is, a common scaling factor in all directions is now used to define
the subsampling regions, as in Sherman and Carlstein (1994) and Sherman
(1996). This constraint will allow us to deal with the counting issues at
the heart of the bias expansion.

Adopting a common scaling factor ${}_s\lambda_n$ for the subsamples also
is sensible for a few other reasons at this stage:

1. “Unconstrained” optimum values of ${}_s\Delta_n$ cannot always be found
by minimizing the asymptotic MSE of $\hat\tau^2_{n,OL}$ or
$\hat\tau^2_{n,NOL}$, even for variance estimation of some desirable
statistics on geometrically “simple” sampling and subsampling regions.
Consider estimating the variance of a real-valued sample mean over a
rectangular sampling region in $\mathbb{R}^d$ based on
$R_0 = (-1/2,1/2]^d$, with observations on $\mathbb{Z}^d$. If Assumptions
A.1-A.5 and Condition $M_1$ hold, the leading term in the bias expansion
can be shown to be

$$\text{Bias of } \hat\tau^2_{n,OL} = -\Big(\sum_{i=1}^{d} \frac{L_i}{{}_s\lambda_i^{(n)}}\Big)(1 + o(1)); \qquad L_i = \sum_{\mathbf{k} \in \mathbb{Z}^d} |k_i|\,\mathrm{Cov}(Z(\mathbf{0}), Z(\mathbf{k})).$$

In using the parenthetical sum above to expand the MSE of
$\hat\tau^2_{n,OL}$, one finds that the resulting MSE cannot be minimized
over the permissible, positive range of ${}_s\Delta_n$ if the signs of the
$L_i$ values are unequal. That is, for $d > 1$, the subsample estimator
MSE cannot always be globally minimized to obtain optimal subsample
factors ${}_s\Delta_n$ by considering just the leading order bias terms.
An effort to determine and incorporate (into the asymptotic MSE) second-
or third-order bias components quickly becomes intractable, even with
rectangular regions.
2. The diagonal components of ${}_s\Delta_n$ are asymptotically scalar
multiples of each other by Assumption A.1. If so desired, a template
choice for $R_0$ could be used to scale the expansion of the subsampling
regions in each direction.

In the continuing discussion, we assume
(4.1) ${}_sR_n = {}_s\lambda_n R_0$.

We frame the components necessary for determining the biases of the
spatial subsample variance estimators in the next theorem. Let

$$C_n(\mathbf{k}) = |\mathcal{Z}^d \cap {}_sR_n \cap (\mathbf{k} + {}_sR_n)|$$

denote the number of pairs of observations in the subsampling region
${}_sR_n$ separated by a translate $\mathbf{k} \in \mathbb{Z}^d$.


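Theorem 4.1. Suppose that $d \ge 2$, ${}_sR_n = {}_s\lambda_n R_0$ and Assumptions A.1-A.5,
Conditions $D_2$ and $M_{2+a}$ hold with $a$ as specified under Condition $D_2$.
If, in addition, ${}_s\lambda_n \in \mathbb{Z}_+$ for NOL subsamples and

(4.2) $\displaystyle \lim_{n\to\infty} \frac{{}_sN_n - C_n(\mathbf{k})}{({}_s\lambda_n)^{d-1}} = C(\mathbf{k})$

exists for all $\mathbf{k} \in \mathbb{Z}^d$, then

$\displaystyle E(\hat\tau_n^2) - \tau_n^2 = -\frac{1}{{}_s\lambda_n |R_0|}\Bigl(\sum_{\mathbf{k}\in\mathbb{Z}^d} C(\mathbf{k})\sigma(\mathbf{k})\Bigr)(1+o(1)),$

where $\sigma(\mathbf{k}) = \mathrm{Cov}(\nabla'Z(\mathbf{t}), \nabla'Z(\mathbf{t}+\mathbf{k}))$ and where $\hat\tau_n^2$ is either $\hat\tau^2_{n,OL}$ or $\hat\tau^2_{n,NOL}$.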
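As a small numerical illustration of the limit in (4.2), the following sketch counts lattice points directly for the square template $R_0 = (-1/2,1/2]^2$. The function names (`in_square`, `pair_count`, `scaled_difference`) are hypothetical helpers introduced here, not part of the paper. For this template the scaled difference should approach $\|\mathbf{k}\|_1$, in line with the entry for $(-1/2,1/2]^2$ in Table 5.

```python
from itertools import product


def in_square(x, lam):
    """True if the lattice point x lies in lam * (-1/2, 1/2]^d."""
    return all(-lam / 2 < xi <= lam / 2 for xi in x)


def pair_count(lam, k, d=2):
    """Return (N, C): N is the number of lattice points in the block and
    C = C_n(k) is the number of block points x with x - k also in the block,
    i.e. the size of Z^d intersected with sR_n and (k + sR_n)."""
    half = int(lam // 2) + 1
    grid = product(range(-half, half + 1), repeat=d)
    pts = [x for x in grid if in_square(x, lam)]
    c = sum(
        1
        for x in pts
        if in_square(tuple(xi - ki for xi, ki in zip(x, k)), lam)
    )
    return len(pts), c


def scaled_difference(lam, k, d=2):
    """(N - C_n(k)) / lam^(d-1), the ratio whose limit is C(k) in (4.2)."""
    n_pts, c = pair_count(lam, k, d)
    return (n_pts - c) / lam ** (d - 1)


if __name__ == "__main__":
    for lam in (10, 50, 200):
        print(lam, scaled_difference(lam, (1, 2)))
```

For integer scaling the square block holds exactly `lam` points per axis, so the count can be checked by hand: $C_n(\mathbf{k}) = (\lambda - |k_1|)(\lambda - |k_2|)$.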


Note that the numerator on the left-hand side of (4.2) is the number
of grid points that lie in the subregion ${}_sR_n$ but not in the translate
$\mathbf{k} + {}_sR_n$. Hence, computing the bias above actually requires counting the
number of lattice points inside intersections like ${}_sR_n \cap (\mathbf{k} + {}_sR_n)$, which is
difficult in general. To handle the problem, one may attempt to estimate
the count $C_n(\mathbf{k})$ with the corresponding Lebesgue volume, $|{}_sR_n \cap (\mathbf{k} + {}_sR_n)|$,
and then quantify the resulting approximation error. The determination of
volumes or areas may not be easy either, but it is hopefully more manageable.
For example, if $R_0$ is a circle, the area of ${}_s\lambda_n R_0$ can be readily computed,
but the number of $\mathbb{Z}^2$ integers inside ${}_s\lambda_n R_0$ is not so simple and was in fact
a famous consideration of Gauss [cf. Krätzel (1988), page 141].
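The contrast between area and lattice count for a circle can be seen in a few lines. This is only a sketch of the classical Gauss circle count for a closed disc of integer radius centered at the origin; the helper names are hypothetical.

```python
import math


def lattice_points_in_disc(radius_sq):
    """Count integer points (x, y) with x^2 + y^2 <= radius_sq (closed disc)."""
    r = math.isqrt(radius_sq)
    total = 0
    for x in range(-r, r + 1):
        # Largest |y| allowed in this column, computed in exact integer arithmetic.
        y_max = math.isqrt(radius_sq - x * x)
        total += 2 * y_max + 1
    return total


def count_minus_area(radius):
    """Discrepancy between the lattice count and the area pi * radius^2."""
    return lattice_points_in_disc(radius * radius) - math.pi * radius * radius


if __name__ == "__main__":
    for radius in (9, 20, 100, 1000):
        print(radius, lattice_points_in_disc(radius * radius), count_minus_area(radius))
```

The area is a closed-form expression, whereas the count has to be accumulated column by column and its discrepancy from the area fluctuates irregularly with the radius.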

We first note that the boundary condition on $R_0$ provides a general (trivial)
bound on the discrepancy between the count $C_n(\mathbf{k})$ and the volume
$|{}_sR_n \cap (\mathbf{k} + {}_sR_n)|$: $O({}_s\lambda_n^{d-1})$. However, the size of the numerator in (4.2) is
also $O({}_s\lambda_n^{d-1})$, corresponding to the order of $\mathbb{Z}^d$ lattice points “near” the
boundary of ${}_sR_n$. Consequently, a standard $O({}_s\lambda_n^{d-1})$ bound on the volume-count
approximation error is too large to immediately justify the exchange
of volumes $|{}_sR_n|$, $|{}_sR_n \cap (\mathbf{k} + {}_sR_n)|$ for counts ${}_sN_n$, $C_n(\mathbf{k})$ in (4.2).

Bounds on the difference between lattice point counts and volumes have
received much attention in analytic number theory, which we briefly mention.
Research has classically focused on sets outlined by “smooth” simple closed
curves in the plane and on one question in particular [Huxley (1996)]:
When a curve with interior area $A$ is “blown up” by a factor $b$, how large
is the difference between the number of integer points inside the new
curve and the area $b^2A$? For convex sets with a smoothly winding border,
van der Corput's (1920) answer to the question posed above is $O(b^{46/69+\epsilon})$,
while the best known answer, for curves with sufficiently differentiable
radius of curvature, is due to Huxley (1993, 1996). These types of bounds, however,
are invalid for many convex polygonal templates $R_0$ in $\mathbb{R}^2$ such as triangles,
trapezoids, and so on, where often the difference between the number of $\mathbb{Z}^2$
integer points in ${}_sR_n = {}_s\lambda_n R_0$ and its area is of exact order $O({}_s\lambda_n)$ (set
also by the boundary condition on $R_0$ or the perimeter length of ${}_sR_n$).
The problem above, as considered by number theorists, does not directly
address counts for intersections between an expanding region and its vector
translates, for example, ${}_sR_n \cap (\mathbf{k} + {}_sR_n)$.

To eventually compute closed-form bias expansions for $\hat\tau_n^2$, we use
approximation techniques for subtracted lattice point counts. For each $\mathbf{k} \in \mathbb{Z}^d$,
we:

1. Replace the numerator of (4.2) with the difference of corresponding Lebesgue
volumes.

2. Show the following error term is of sufficiently small order $o({}_s\lambda_n^{d-1})$:

$({}_sN_n - C_n(\mathbf{k})) - ({}_s\lambda_n^d|R_0| - |{}_sR_n \cap (\mathbf{k} + {}_sR_n)|)$

$\quad = ({}_sN_n - {}_s\lambda_n^d|R_0|) - (C_n(\mathbf{k}) - |{}_sR_n \cap (\mathbf{k} + {}_sR_n)|).$
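The error term in step 2 can be evaluated directly for a simple case. The sketch below uses a closed disc of integer radius `r` centered at the origin and compares the count difference with the volume difference; the lens-area formula for two overlapping discs of equal radius is standard geometry, and the function names are hypothetical.

```python
import math


def in_disc(x, y, r):
    """True if the integer point (x, y) lies in the closed disc of radius r."""
    return x * x + y * y <= r * r


def count_difference(r, k):
    """N - C(k): lattice points x in the disc whose translate x - k is outside it."""
    k1, k2 = k
    return sum(
        1
        for x in range(-r, r + 1)
        for y in range(-r, r + 1)
        if in_disc(x, y, r) and not in_disc(x - k1, y - k2, r)
    )


def volume_difference(r, k):
    """Area of the disc minus the area of its intersection with the disc shifted by k."""
    t = math.hypot(*k)
    if t >= 2 * r:
        return math.pi * r * r
    lens = 2 * r * r * math.acos(t / (2 * r)) - 0.5 * t * math.sqrt(4 * r * r - t * t)
    return math.pi * r * r - lens


def error_term(r, k):
    """The quantity in step 2: count difference minus volume difference."""
    return count_difference(r, k) - volume_difference(r, k)


if __name__ == "__main__":
    for r in (25, 50, 100):
        print(r, error_term(r, (1, 0)), error_term(r, (1, 2)))
```

Both differences grow like the radius, so the relevant question is whether `error_term` stays of smaller order than `r`, which is what the cancellation argument asserts for suitable templates.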

We do approximate the number of lattice points in ${}_sR_n$ and ${}_sR_n \cap (\mathbf{k} + {}_sR_n)$
by set volumes, though the Lebesgue volume may not adequately capture the
lattice point count in either set. However, the difference between approximation
errors ${}_sN_n - {}_s\lambda_n^d|R_0|$ and $C_n(\mathbf{k}) - |{}_sR_n \cap (\mathbf{k} + {}_sR_n)|$ can be shown to
be asymptotically small enough, for some templates $R_0$, to justify replacing
counts with volumes in (4.2) (see Lemma 10.4). That is, these two volume-count
estimation errors can cancel to a sufficient extent when subtracted.
The above approach becomes slightly more complicated for NOL subsamples,
$R_{\mathbf{i},n} = {}_s\lambda_n(\mathbf{i} + R_0)$, which may vary in number of sampling sites ${}_sN_{\mathbf{i},n}$.
In this case, errors incurred by approximating counts $|\mathbb{Z}^d \cap R_{\mathbf{i},n} \cap (\mathbf{k} + R_{\mathbf{i},n})|$
with volumes $|R_{\mathbf{i},n} \cap (\mathbf{k} + R_{\mathbf{i},n})|$ are shown to be asymptotically negligible,
uniformly in $\mathbf{i} \in J_{NOL}$.

In the following theorem, we use this technique to give bias expansions
for a large class of sampling regions in $\mathbb{R}^d$, $d \le 3$, which are “nearly” convex.
The sampling region $R_n$ may differ from a convex set possibly only at its
boundary, but sampling sites on the border may be arbitrarily included or
excluded from $R_n$.

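Some notation is additionally required. For $\alpha = (\alpha_1,\ldots,\alpha_p)' \in (\mathbb{Z}_+)^p$,
$x \in \mathbb{R}^p$, write $x^\alpha = \prod_{i=1}^p x_i^{\alpha_i}$, $\alpha! = \prod_{i=1}^p(\alpha_i!)$, $c_\alpha = D^\alpha H(\mu)/\alpha!$. Let
$Z_\infty$ denote a random vector with a normal $\mathcal{N}(0,\Sigma_\infty)$ distribution on $\mathbb{R}^p$,
where $\Sigma_\infty$ is the limiting covariance matrix of the scaled sample mean
$\sqrt{N_n}(\bar Z_{N_n} - \mu)$ from (2.1). Let $B^\circ$, $\bar B$ denote the interior and closure of
$B \subset \mathbb{R}^d$, respectively.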


Theorem 4.2. Suppose sRn = s^nRo and there exists a convex set B 
such that B° <Z Rq G B. With Assumptions A.2-A.5, assume Conditions 
and M^^a+a hold for d G {1,2,3} (where a is as specified by the re¬ 
spective Condition D,.). Then 

C{k) = V{k)= hm \sRn\-\sRnnik + sRn)\ 
n^oo {sKY-^ 

whenever V (k) exists and the biases E(f'^ q^) — E(f^ nol) “ '^n are equal 
to, for d = l, 

for d = 2 or 3, 

(- ~ ^ + sRn )\ ^ 


or 


—— j: r(kWk) (1 + 0(1)), 

s n| Ol Vi^ggd / 


provided eachVfk) exists, w/iere cj(k) = Cov(V'Z(t), V'Z(t + k)) and 



||q;||i = 1 

ll/3||i=3 


+ 2 E 

ki,k2eZ 


E 


Ca*^(/3+7) 

C^TE! 


ll/3||i=Lll7lli=l 


$\times\, E([Z(\mathbf{t}) - \mu]^{\alpha}[Z(\mathbf{t} + k_1) - \mu]^{\beta}[Z(\mathbf{t} + k_2) - \mu]^{\gamma})$.

Remark 4.1. If Condition $D_m$ holds with $C = 0$ for some $m \in \{2,3,4\}$,
then Condition $M_{m-1}$ is sufficient in Theorem 4.2.

Remark 4.2. For each $\mathbf{k} \in \mathbb{Z}^d$, the numerator in $V(\mathbf{k})$ is $O({}_s\lambda_n^{d-1})$
by the $R_0$-boundary condition, which holds for convex templates. We may
then expand the bias of the estimators through the limiting, scaled volume
differences $V(\mathbf{k})$. For $d = 1$, with samples and subsamples based on intervals,
it can be easily seen that $V(k) = |k|$, which appears in Theorem 4.2.

The function $H(\cdot)$ needs to be increasingly “smoother” to determine the
bias component of $\hat\tau^2_{n,OL}$ or $\hat\tau^2_{n,NOL}$ in lower-dimensional spaces $d = 1$ or 2.

For a real-valued time series sample mean $\hat\theta_n = \bar Z_{N_n}$, the well-known bias
of the subsample variance estimators follows from Theorem 4.2 under our
sampling framework $R_0 = (-1/2,1/2]$, $\mathcal{Z} = \mathbb{Z}$ as

(4.3) $\displaystyle -\frac{1}{{}_s\lambda_n}\Bigl(\sum_{k\in\mathbb{Z}} |k|\,\mathrm{Cov}(\nabla'Z(0),\nabla'Z(k))\Bigr)$

with $\nabla = 1$. In general though, terms in the Taylor expansion of $\hat\theta_n$ (around
$\mu$) up to fourth order can contribute to the bias of $\hat\tau^2_{n,OL}$ and $\hat\tau^2_{n,NOL}$ when
$d = 1$. In contrast, the asymptotic bias of the time series MBB variance
estimator with “smooth” model statistics is very different from its
subsample-based counterpart. The MBB variance estimator's bias is given by (4.3),
determined only by the linear component from the Taylor expansion of $\hat\theta_n$
[cf. Lahiri (1996)].

5. Asymptotically optimal subsample sizes. In the following, we consider
“size” selection for the subsampling regions to maximize the large-sample
accuracy of the subsample variance estimators. For reasons discussed in
Section 4, we examine a theoretically optimal scaling choice ${}_s\lambda_n$ for subregions
in (4.1).

5.1. Theoretical optimal subsample sizes. Generally speaking, there is a
trade-off in the effect of subsample size on the bias and variance of $\hat\tau^2_{n,OL}$ or $\hat\tau^2_{n,NOL}$.
Increasing ${}_s\lambda_n$ reduces the bias but increases the variance of the estimators.

The best value of ${}_s\lambda_n$ optimizes the overall performance of a subsample
variance estimator by balancing the contributions from both the estimator's
variance and bias. An optimal ${}_s\lambda_n$ choice can be found by minimizing the
asymptotic order of a variance estimator's MSE under a given OL or NOL
sampling scheme.
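The balancing act can be made concrete with a short sketch. It assumes, as a working hypothesis not restated in this section, that the leading variance term has the form $K \cdot ({}_s\lambda_n^d/\det(\Delta_n)) \cdot 2\tau^4$ for a geometric constant $K$, and that the leading bias is $-B_0/{}_s\lambda_n$. All function and parameter names below are hypothetical.

```python
def asymptotic_mse(lam, b0, tau2, k_const, det_delta, d):
    """Leading-order MSE under the assumed expansions:
    squared bias (b0 / lam)^2 plus variance k_const * lam^d / det_delta * 2 * tau2^2."""
    bias_sq = (b0 / lam) ** 2
    variance = k_const * (lam ** d / det_delta) * 2.0 * tau2 ** 2
    return bias_sq + variance


def optimal_scaling(b0, tau2, k_const, det_delta, d):
    """Closed-form minimizer of asymptotic_mse in lam, obtained by setting
    the derivative -2 b0^2 / lam^3 + d * c * lam^(d-1) to zero."""
    return (det_delta * b0 ** 2 / (d * k_const * tau2 ** 2)) ** (1.0 / (d + 2))


def grid_minimizer(b0, tau2, k_const, det_delta, d, lo=1.0, hi=50.0, step=0.01):
    """Brute-force check of the closed form on a fine grid of lam values."""
    n_steps = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n_steps + 1)]
    return min(grid, key=lambda lam: asymptotic_mse(lam, b0, tau2, k_const, det_delta, d))


if __name__ == "__main__":
    args = dict(b0=5.0, tau2=4.0, k_const=4.0 / 9.0, det_delta=3600.0, d=2)
    print(optimal_scaling(**args), grid_minimizer(**args))
```

Under these assumptions the optimal scaling grows like $\det(\Delta_n)^{1/(d+2)}$: larger bias constants push toward bigger subsamples, larger variance constants toward smaller ones.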

Theorem 4.1 implies that the bias of the estimators $\hat\tau^2_{n,OL}$ and $\hat\tau^2_{n,NOL}$
is of exact order $O(1/{}_s\lambda_n)$. For a broad class of sampling regions $R_n$, the
leading order bias component can be determined explicitly with Theorem
4.2. We bring these variance and bias expansions together to obtain an
optimal subsample scaling factor ${}_s\lambda_n$.

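Theorem 5.1. Let ${}_sR_n = {}_s\lambda_n R_0$. With Assumptions A.2-A.5, assume
Conditions $D_2$ and $M_{5+2a}$ hold if $d \ge 2$ or Conditions $D_3$ and $M_{7+2a}$ hold
if $d = 1$ (where $a$ is as specified by the respective Condition $D_r$). If

$\displaystyle B_0|R_0| \equiv \sum_{\mathbf{k}\in\mathbb{Z}^d} C(\mathbf{k})\sigma(\mathbf{k}) + I_{\{d=1\}}C_\infty \ne 0,$

then

$\displaystyle {}_s\lambda^{opt}_{n,OL} = \Bigl(\frac{\det(\Delta_n)(B_0)^2}{dK_0\tau^4}\Bigr)^{1/(d+2)}(1+o(1))$

and

$\displaystyle {}_s\lambda^{opt}_{n,NOL} = \Bigl(\frac{\det(\Delta_n)|R_0|(B_0)^2}{d\tau^4}\Bigr)^{1/(d+2)}(1+o(1)).$

Remark 5.1. If Condition $D_m$ holds with $C = 0$ for some $m \in \{2,3\}$,
then Condition $M_{2m+1}$ is sufficient.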

Remark 5.2. Theorem 5.1 suggests that optimally scaled OL subsamples
should be larger than the NOL ones by a scalar: $(K_1)^{-1/(d+2)} > 1$ where
$K_1 = K_0|R_0|$ is the limiting ratio of variances from (3.1).

It is well known in the time series case that the OL subsampling scheme
produces an asymptotically more efficient variance estimator than its NOL
counterpart. We can now quantify the relative efficiency of the two
subsampling procedures in $d$-dimensional sampling space. With each variance
estimator respectively optimized using (4.1), $\hat\tau^2_{n,OL}$ is more efficient than
$\hat\tau^2_{n,NOL}$ and the asymptotic relative efficiency ($ARE_d$) of $\hat\tau^2_{n,NOL}$ to $\hat\tau^2_{n,OL}$
depends solely on the geometry of $R_0$,

$\displaystyle ARE_d = \lim_{n\to\infty} \frac{E(\hat\tau^2_{n,OL} - \tau_n^2)^2}{E(\hat\tau^2_{n,NOL} - \tau_n^2)^2} = (K_1)^{2/(d+2)} < 1.$

Possolo (1991), Politis and Romano (1993a, 1994), Hall and Jing (1996) and
Garcia-Soidan and Hall (1997) have examined subsampling with rectangular
regions based essentially on $R_0 = (-1/2,1/2]^d$. Using the geometrical
characteristic $K_1 = (2/3)^d$ for rectangles, we can now examine the effect of
the sampling dimension on the $ARE_d$ of $\hat\tau^2_{n,NOL}$ to $\hat\tau^2_{n,OL}$ for these sampling
regions. Although the $ARE_d$ decreases as the dimension $d$ increases, we find
the relative improvement of $\hat\tau^2_{n,OL}$ over $\hat\tau^2_{n,NOL}$ is ultimately limited and the
$ARE_d$ has a lower bound of 4/9 for all $\mathbb{R}^d$-rectangular regions.
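The dimension effect for rectangles can be tabulated in a few lines. The sketch assumes the efficiency takes the form $(K_1)^{2/(d+2)}$ with $K_1 = (2/3)^d$, which is what minimizing a leading-order MSE of the kind "squared bias of order ${}_s\lambda_n^{-2}$ plus variance of order ${}_s\lambda_n^{d}$" would give; the function name is hypothetical.

```python
def are_rectangle(d):
    """Assumed asymptotic relative efficiency of NOL to OL for rectangles in R^d:
    K_1^(2 / (d + 2)) with K_1 = (2/3)^d."""
    k1 = (2.0 / 3.0) ** d
    return k1 ** (2.0 / (d + 2))


if __name__ == "__main__":
    for d in (1, 2, 3, 5, 10, 100):
        print(d, round(are_rectangle(d), 4))
    print("limit:", 4.0 / 9.0)
```

Since the exponent $2d/(d+2)$ increases to 2, the values decrease toward $(2/3)^2 = 4/9$ without reaching it.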




sK,NOL={ - ^^4 - ) (l + o(l))- 
5.2. Theoretical optimal subsample shapes. We conclude this section by
addressing a question raised by a referee on subsample shape selection.
Although not widely considered in the literature, subsample variance
estimators are also possible by using subsamples of a freely chosen shape, rather
than scaled-down copies of $R_n$. Nordman and Lahiri (2003) discuss comparing
variance estimators, based on differently shaped subsamples, through
their asymptotic relative efficiency. This involves finding MSE expansions
for estimators with OL, NOL subsamples of an arbitrary shape with optimal
scaling (e.g., modified versions of Theorems 3.1, 4.1 and 5.1). However,
because both the subsample geometry and the r.f. covariances influence a
subsample estimator's bias (see Section 6), a direct comparison of asymptotic
MSEs to choose an optimal subsample shape can become complicated,
especially for OL subsamples.

For illustration, consider selecting between circular and rectangular
subsamples for sample mean $\hat\theta_n = \bar Z_{N_n} \in \mathbb{R}$ variance estimation on a rectangular
region $R_n \subset \mathbb{R}^2$ under a Gaussian isotropic covariogram,

$\sigma(\mathbf{k}) = \exp(-\beta\|\mathbf{k}\|^2)$, $\mathbf{k} \in \mathbb{Z}^2$.

The value of $\beta$ heavily affects the large sample performances of circles and
rectangles (i.e., scaled-down copies of $R_n$) as subsamples and makes the
choice of subsample shape difficult. For example, the asymptotic efficiency of
circular to rectangular OL (NOL) subsamples is 0.9259 (1.0274) for $\beta = 0.2$
and 1.0758 (1.1937) for $\beta = 2$. We conducted a small simulation study of the
finite sample efficiencies of these subsample shapes on several rectangular $R_n$
to compare with the asymptotic values. The results in Table 4 indicate that
the asymptotic advantages of a subsample shape may also not be readily
apparent in finite samples due to edge effects. See Nordman and Lahiri
(2003) for further details and examples on the effect of subsample shape for
variance estimation.

6. Examples. We now provide some examples of the important quantities
$K_0$, $K_1$, $B_0$ associated with optimal scaling ${}_s\lambda_n^{opt}$ with some common
sampling region templates, determined from Theorems 3.1 and 4.2. For
subsamples from (4.1), the theoretically best ${}_s\lambda_n^{opt}$ can also be formulated in
terms of $|R_n| = \det(\Delta_n)|R_0|$ (sampling region volume), $K_1$ and $B_0$.

6.1. Examples in $\mathbb{R}^2$.

Example 1. Rectangular regions in $\mathbb{R}^2$ (potentially rotated); if
$R_0 = \{((\ell_1\cos\theta, \ell_2\sin\theta)\mathbf{x}, (-\ell_1\sin\theta, \ell_2\cos\theta)\mathbf{x})' : \mathbf{x} \in (-1/2,1/2]^2\}$
for $\theta \in [0,\pi]$, $0 < \ell_1, \ell_2$, then

$\displaystyle K_0 = \frac{4}{9}, \qquad B_0 = \sum_{\mathbf{k}\in\mathbb{Z}^2,\ \mathbf{k}=(k_1,k_2)'} \Bigl(\frac{|k_1\cos\theta - k_2\sin\theta|}{\ell_1} + \frac{|k_1\sin\theta + k_2\cos\theta|}{\ell_2}\Bigr)\sigma(\mathbf{k}).$

The characteristics $K_1$, $B_0$ for determining optimal subsamples based on
two rectangular templates, including a diamond-shaped region (i.e., $\theta = \pi/4$,
$\ell_1 = \ell_2 = 1/\sqrt{2}$), are further described in Table 5.

Table 4

Minimal normalized MSE $E(\hat\tau_n^2/\tau_n^2 - 1)^2$ for OL/NOL subsample estimators of
sample mean variance $\tau_n^2 = N_n\mathrm{Var}(\bar Z_{N_n})$ on $R_n \cap \mathbb{Z}^2$, with $\sigma(\mathbf{k}) = \exp(-\beta\|\mathbf{k}\|^2)$, $\mathbf{k} \in \mathbb{Z}^2$
(based on 1000 simulations). Rectangular (rec.) and circular (cir.) subsamples ${}_s\lambda_n^{opt}R_0$
are based on $R_0 = (-1/2,1/2]^2$, $\{\mathbf{x} \in \mathbb{R}^2 : \|\mathbf{x}\| \le 1/2\}$ using optimal scaling ${}_s\lambda_n^{opt}$ (an
integer listed beside each MSE). Estimated relative efficiencies (RE) of cir. versus rec.
subsamples are also listed

               rec. subsamples             cir. subsamples             cir./rec. RE
R_n            OL            NOL           OL            NOL           OL        NOL

beta = 0.2
(-5,5)^2       0.4295 (4)    0.4261 (5)    0.4519 (2)    0.4286 (5)    1.0521    1.0060
(-10,10)^2     0.2329 (5)    0.2183 (5)    0.2418 (5)    0.2328 (5)    1.0384    1.0661
(-30,30)^2     0.0806 (10)   0.0842 (10)   0.0835 (9)    0.0944 (9)    1.0355    1.1260
(-50,50)^2     0.0482 (14)   0.0562 (11)   0.0462 (15)   0.0601 (11)   0.9585    1.0698

beta = 2
(-5,5)^2       0.0841 (2)    0.0978 (2)    0.1170 (2)    0.1426 (1)    1.3890    1.4570
(-10,10)^2     0.0436 (3)    0.0515 (2)    0.0436 (3)    0.0641 (3)    1.0000    1.2432
(-30,30)^2     0.0128 (5)    0.0162 (4)    0.0138 (5)    0.0199 (5)    1.0771    1.2260
(-50,50)^2     0.0082 (6)    0.0111 (5)    0.0092 (7)    0.0129 (5)    1.1139    1.1594


Example 2. If $R_0$ is a circle of radius $r \le 1/2$ centered at the origin,
then $K_0$ appears in Table 3 and $B_0 = \frac{2}{r\pi}\sum_{\mathbf{k}\in\mathbb{Z}^2}\|\mathbf{k}\|\sigma(\mathbf{k})$.

Example 3. For any triangle, $K_0 = 2/5$. Two examples are provided in
Tables 2 and 5.


Example 4. If $R_0$ is a regular hexagon, centered at the origin and with
side length $\ell \le 1/2$, then

Ko = 1^, .Bo = ^ (|/C 2 | +max{\/3|/ci|, |A; 2 |})o-(k). 

^ kGZ2 
Table 5

Examples of several shapes of $R_0 \subset \mathbb{R}^2$ and associated $K_1$, $B_0$ for ${}_s\lambda_n^{opt}$

R_0                              K_1                       B_0

(-1/2,1/2]^2                     4/9                       $\sum_{\mathbf{k}\in\mathbb{Z}^2}\|\mathbf{k}\|_1\sigma(\mathbf{k})$
Circle of radius 1/2 at origin   $\pi/4 - 4/(3\pi)$        $(4/\pi)\sum_{\mathbf{k}\in\mathbb{Z}^2}\|\mathbf{k}\|\sigma(\mathbf{k})$
Diamond in Figure 1(i)           2/9                       $2\sum_{\mathbf{k}\in\mathbb{Z}^2}\|\mathbf{k}\|_\infty\sigma(\mathbf{k})$
Right triangle in Figure 1(ii)   1/5                       Table 2
Triangle in Figure 1(iii)        1/5                       $\sum_{\mathbf{k}\in\mathbb{Z}^2}(|k_2| + \max\{2|k_1|,|k_2|\})\sigma(\mathbf{k})$
Parallelogram in Figure 1(iv)    $2/9 + (\sqrt{5}-1)/375$  4/k5Ek6z2(ki - 2fe|/5 + |fc2|)a(k)


Example 5. For any parallelogram in $\mathbb{R}^2$ with interior angle $\gamma$ and
adjacent sides of ratio $b \ge 1$, $K_0 = 4/9 + 2/15 \cdot b^{-2}|\cos\gamma|(1 - |\cos\gamma|)$. In particular,
if a parallelogram $R_0$ is formed by two vectors $(0, \ell_1)'$, $(\ell_2\cos\gamma, \ell_2\sin\gamma)'$
extended from a point $\mathbf{x} \in (-1/2,1/2]^2$, then

1 / |fci-|cos 6 l|-fc 2 -|sin 6 >|| 1 ^ 2 ! \ . 

° |sin6»|jJ^2V max{/i,Z2} min{/i,/2}/ 

$\gamma \in (0,\pi)$, $\ell_1, \ell_2 > 0$.

For further bias term $B_0$ calculation tools with more general (nonconvex)
sampling regions and templates $R_0$ (represented as the union of two
approximately convex sets), see Nordman (2002).

6.2. Examples in $\mathbb{R}^d$, $d \ge 3$.

Example 6. For any sphere, $K_0$ is given in Table 3. The properties $B_0$,
$K_1$ of the sphere described in Tables 1 and 2 correspond to the template
sphere $R_0$ of radius 1/2 with maximal volume in $(-1/2,1/2]^3$.

Example 7. The $K_0$ value for any cylinder appears in Table 3. If
$R_0$ is a cylinder with circular base (parallel to the x-y plane) of radius $r$



Fig. 1. Examples of templates $R_0 \subset (-1/2,1/2]^2$ are outlined by solid lines. Cross-shaped
sampling regions described in Table 2 are based on $R_0$ in (v).
and height $h$, then

$\displaystyle B_0 = \sum_{\mathbf{k}\in\mathbb{Z}^3,\ \mathbf{k}=(k_1,k_2,k_3)'} \Bigl(\frac{2\sqrt{k_1^2 + k_2^2}}{\pi r} + \frac{|k_3|}{h}\Bigr)\sigma(\mathbf{k}).$


The results of Theorem 4.2 for determining the bias $B_0$ also seem plausible
for convex sampling regions in $\mathbb{R}^d$, $d \ge 4$, but require further study of lattice
point counting techniques in higher dimensions. However, bias expansions
of the OL and NOL subsample variance estimators are relatively
straightforward for an important class of rectangular sampling regions based on the
prototype $R_0 = (-1/2,1/2]^d$, which can then be used in optimal subsample
scaling. These hypercubes have “faces” parallel to the coordinate axes,
which simplifies the task of counting sampling sites, or lattice points, within
such regions. We give precise bias expansions in the following theorem, while
allowing for potentially missing sampling sites at the border of the sampling
region $R_n$.

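Theorem 6.1. Let $(-1/2,1/2)^d \subset \Lambda_\ell^{-1}R_0 \subset [-1/2,1/2]^d$, $d \ge 3$, for a
$d \times d$ diagonal matrix $\Lambda_\ell$ with entries $0 < \ell_i \le 1$, $i = 1,\ldots,d$. Suppose ${}_sR_n =
{}_s\lambda_n R_0$ and Assumptions A.2-A.5, Conditions $D_2$ and $M_{2+a}$ hold with $a$ as
specified under Condition $D_2$. Then the biases $E(\hat\tau^2_{n,OL}) - \tau_n^2$, $E(\hat\tau^2_{n,NOL}) - \tau_n^2$
are equal to $-{}_s\lambda_n^{-1}B_0(1+o(1))$ where

$\displaystyle B_0 = \sum_{\mathbf{k}\in\mathbb{Z}^d}\Bigl(\sum_{i=1}^{d}\frac{|k_i|}{\ell_i}\Bigr)\sigma(\mathbf{k}), \qquad \sigma(\mathbf{k}) = \mathrm{Cov}(\nabla'Z(\mathbf{t}), \nabla'Z(\mathbf{t}+\mathbf{k})).$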

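Example 8. For rectangular sampling regions $R_n = \Delta_n(-1/2,1/2]^d$,
optimal subsamples (4.1) may be chosen with

$\displaystyle {}_s\lambda^{opt}_{n,NOL} = \Bigl(\frac{\det(\Delta_n)}{d\tau^4}\Bigl(\sum_{\mathbf{k}\in\mathbb{Z}^d}\|\mathbf{k}\|_1\sigma(\mathbf{k})\Bigr)^2\Bigr)^{1/(d+2)}(1+o(1))$

or

$\displaystyle {}_s\lambda^{opt}_{n,OL} = {}_s\lambda^{opt}_{n,NOL}\Bigl(\frac{3}{2}\Bigr)^{d/(d+2)},$

using the template $R_0 = (-1/2,1/2]^d$.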
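To see what Example 8 involves numerically, the sketch below evaluates the two lattice sums for the Gaussian covariogram $\sigma(\mathbf{k}) = \exp(-\beta\|\mathbf{k}\|^2)$ of Section 5.2 by truncation. It assumes the sample-mean case, where the limiting variance is $\tau^2 = \sum_{\mathbf{k}}\sigma(\mathbf{k})$, and it assumes the NOL scaling has the form $(\det(\Delta_n)B_0^2/(d\tau^4))^{1/(d+2)}$ with the OL scaling larger by the factor $(3/2)^{d/(d+2)}$. The function names and the truncation `cutoff` are hypothetical choices.

```python
import math
from itertools import product


def covariogram_sums(beta, d=2, cutoff=30):
    """Truncated sums over k in Z^d of sigma(k) and ||k||_1 * sigma(k),
    for sigma(k) = exp(-beta * ||k||^2). Terms beyond the cutoff are negligible
    for the beta values used here."""
    tau2 = 0.0
    b0 = 0.0
    for k in product(range(-cutoff, cutoff + 1), repeat=d):
        s = math.exp(-beta * sum(ki * ki for ki in k))
        tau2 += s
        b0 += sum(abs(ki) for ki in k) * s
    return tau2, b0


def rectangle_scaling(det_delta, beta, d=2):
    """Assumed leading-order optimal scalings (OL, NOL) for a rectangular region."""
    tau2, b0 = covariogram_sums(beta, d)
    nol = (det_delta * b0 ** 2 / (d * tau2 ** 2)) ** (1.0 / (d + 2))
    ol = nol * 1.5 ** (d / (d + 2))
    return ol, nol


if __name__ == "__main__":
    # A 60 x 60 region, i.e. det(Delta_n) = 3600, for the two beta values of Section 5.2.
    for beta in (0.2, 2.0):
        print(beta, rectangle_scaling(3600.0, beta))
```

These are leading-order values only; finite-sample optima such as those in Table 4 are integers found by simulation and need not coincide with them.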

7. Empirical subsample size determination. This section considers
data-based estimation of the theoretical optimal scaling factor ${}_s\lambda_n^{opt}$ for
subsamples as in (4.1). We describe two estimation techniques for this. One approach
involves using “plug-in” estimates and the second involves minimizing an
estimated MSE criterion function. In Section 8 we evaluate both estimation
methods for ${}_s\lambda_n^{opt}$ through a simulation study. Inference on “best” subsample
scaling closely resembles the problem of empirically gauging the theoretically
optimal block length with the MBB variance estimator. With time series,
estimation rules of optimal MBB block size have been developed using both
plug-in and empirical MSE methods [cf. Bühlmann and Künsch (1999) and
Hall, Horowitz and Jing (1995)].

Hall and Jing (1996) give a method for estimating optimal subsample
scaling through minimization of an estimated MSE function in the time series
case. Considering OL subsamples first, we adapt this approach (hereafter the
HJ method) for spatial subsampling as follows. We determine the template
$R_0$ as the largest scaled copy of $R_n$ that fits within $(-1/2,1/2]^d$. Let $J_{OL}(\lambda_m)$
denote a collection of OL subsamples using a scaling factor ${}_s\lambda_n = \lambda_m > 0$
in (4.1). Here $\lambda_m$ is a “smoothing parameter.” We treat each subsample
in $J_{OL}(\lambda_m)$ as a scale $\lambda_m R_0$ sampling region on which an OL subsample
variance estimator, with subsample scaling ${}_s\lambda_m < \lambda_m$, can be computed.
Denote the resulting variance estimates as $\hat\tau^2_{i,OL}({}_s\lambda_m)$, $i = 1,\ldots,|J_{OL}(\lambda_m)|$. Write
$\hat\tau^2_{n,OL}(\lambda_m)$ for the variance estimator computed on the region $R_n$ with
subsample scaling $\lambda_m$. An estimate of the MSE when using subsamples of
size ${}_s\lambda_m R_0$ on regions of size $\lambda_m R_0$ is the average of the squared differences
$(\hat\tau^2_{i,OL}({}_s\lambda_m) - \hat\tau^2_{n,OL}(\lambda_m))^2$. We then select the value of ${}_s\lambda_m$, say ${}_s\hat\lambda_m^{opt}$, which
minimizes this data-based MSE.
We use Theorem 5.1 to appropriately recalibrate the estimate ${}_s\hat\lambda_m^{opt}$ to
estimate optimal subsample scaling ${}_s\lambda_n^{opt}$ for $R_n$-size regions. For optimal
scaling estimation with NOL subsamples, we replace $J_{OL}(\lambda_m)$, $\hat\tau^2_{i,OL}$, $\hat\tau^2_{n,OL}$ with
$J_{NOL}(\lambda_m)$, $\hat\tau^2_{i,NOL}$, $\hat\tau^2_{n,NOL}$ above. Garcia-Soidan and Hall (1997) apply a similar
empirical MSE selection procedure with subsample-based distribution
estimators on rectangular sampling regions.

An advantage of a plug-in estimate of scaling is that it is computationally
less demanding than minimization of an estimated MSE. A nonparametric
plug-in (NPI) procedure involves substituting estimates of unknown r.f.
parameters appearing in ${}_s\lambda_n^{opt}$ from Theorem 5.1. To do this, we propose
using subsample variance estimators based on two smoothing parameter
choices. Let $\hat\tau_n^2({}_s\lambda_n)$ denote a subsample variance estimator with scaling
${}_s\lambda_n$ in (4.1). Using a pilot scalar ${}_s\lambda_n^{(1)}$, a power of $|R_n|$ multiplied by a constant
$c_1 > 0$, we estimate the limiting variance appearing in ${}_s\lambda_n^{opt}$ with $\hat\tau_n^2({}_s\lambda_n^{(1)})$.
With a second smoothing parameter ${}_s\lambda_n^{(2)}$, defined similarly with a constant
$c_2 > 0$, we estimate the bias component $B_0$ with
$\hat B_0 = 2\,{}_s\lambda_n^{(2)}[\hat\tau_n^2(2\,{}_s\lambda_n^{(2)}) - \hat\tau_n^2({}_s\lambda_n^{(2)})]$. It follows easily from
Theorems 3.1-4.1 that the estimator $\hat B_0$ is consistent when the bias of $\hat\tau_n^2({}_s\lambda_n)$ is
$-{}_s\lambda_n^{-1}B_0(1 + o(1))$. With time series $d = 1$, Lahiri, Furukawa and Lee (2003)
suggest a similar bias estimate for the MBB variance estimator and show the
order of ${}_s\lambda_n^{(2)}$ above is asymptotically optimal. Politis and Romano (1995)
also consider combining two subsample estimators in kernel spectral density
estimation. We conjecture that the order of ${}_s\lambda_n^{(2)}$ is optimal for minimizing the
asymptotic MSE in estimating $B_0$ with spatial subsampling ($d \ge 2$) and this
can be established for rectangular sampling regions.
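The plug-in bias estimate can be sketched in code as follows. The estimator `ol_subsample_variance` is an assumption made for illustration: it uses all overlapping `b x b` square blocks of a rectangular grid, scales squared deviations of block means by the block size, and centers at the average of the block means. The paper's own definition appears in an earlier section and may differ in detail. The function `npi_bias_estimate` implements only the difference formula for $\hat B_0$ quoted above, and all names are hypothetical.

```python
import random


def ol_subsample_variance(field, b):
    """Sketch of an OL subsample variance estimate for the sample mean on an
    n x m grid (list of lists), using every b x b square block."""
    n, m = len(field), len(field[0])
    # 2-D prefix sums so that each block sum costs O(1).
    pre = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(m):
            pre[i + 1][j + 1] = field[i][j] + pre[i][j + 1] + pre[i + 1][j] - pre[i][j]
    means = []
    for i in range(n - b + 1):
        for j in range(m - b + 1):
            total = pre[i + b][j + b] - pre[i][j + b] - pre[i + b][j] + pre[i][j]
            means.append(total / (b * b))
    center = sum(means) / len(means)
    return sum(b * b * (x - center) ** 2 for x in means) / len(means)


def npi_bias_estimate(tau2_hat, lam2):
    """B0_hat = 2 * lam2 * [tau2_hat(2 * lam2) - tau2_hat(lam2)], where tau2_hat
    is any callable returning a subsample variance estimate for a given scaling."""
    return 2.0 * lam2 * (tau2_hat(2.0 * lam2) - tau2_hat(lam2))


if __name__ == "__main__":
    random.seed(0)
    # Independent N(0, 1) field: the true variance parameter is 1 and B_0 is 0.
    field = [[random.gauss(0.0, 1.0) for _ in range(40)] for _ in range(40)]
    tau2_hat = lambda lam: ol_subsample_variance(field, int(lam))
    print(tau2_hat(4), npi_bias_estimate(tau2_hat, 3))
```

If the estimator's expectation were exactly $\tau^2 - B_0/\lambda$, the difference formula would return $B_0$ exactly, which is the reasoning behind the consistency claim.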

For subsample variance estimation of a time series mean, other plug-in
rules for ${}_s\lambda_n^{opt}$ are given in Carlstein (1986) [with AR(1) models], Léger,
Politis and Romano (1992) and Politis and Romano (1993b).

8. Numerical studies. 

8.1. Performance comparison of subsample types. We conducted a
simulation study to compare the finite sample performances of OL and NOL
subsample variance estimators of $\mathrm{Var}(\hat\theta_n)$, where $\hat\theta_n = \bar Z_{N_n}$ is the
real-valued sample mean over a sampling region $R_n \subset \mathbb{R}^2$. Rectangular and
circular regions $R_n$ of two different sizes were considered:

$R_n := (-7,7] \times (-9,9]$, $R_n := (-15,15] \times (-21,21]$,

$R_n := \{\mathbf{x} \in \mathbb{R}^2 : \|\mathbf{x}\| \le 9\}$, $R_n := \{\mathbf{x} \in \mathbb{R}^2 : \|\mathbf{x}\| \le 20\}$.

The smaller (larger) circle contains one integer point more (seven less)
than the smaller (larger) rectangle. The rectangular regions have
approximately the same ratio of side lengths.

Using the algorithm of Chan and Wood (1997), we generated mean zero
Gaussian random fields on $\mathbb{Z}^2$ with one of the following covariance structures:

(8.1) Model E($\beta_1,\beta_2$): $\sigma(\mathbf{k}) = \exp[-\beta_1|k_1| - \beta_2|k_2|]$,
      Model G($\beta_1,\beta_2$): $\sigma(\mathbf{k}) = \exp[-\beta_1|k_1|^2 - \beta_2|k_2|^2]$,
      $\mathbf{k} = (k_1,k_2)' \in \mathbb{Z}^2$, $\beta_1,\beta_2 > 0$.

Models E and G correspond to exponential and Gaussian covariograms,
respectively. We consider the values $(\beta_1,\beta_2) = (0.5,0.3), (1,1)$ in both models
to obtain isotropic and anisotropic covariograms exhibiting various rates of
decay.
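For Model E the lattice sums that enter the optimal scaling have closed forms, because the covariogram factors into two geometric sequences with ratios $\rho_i = e^{-\beta_i}$. The sketch below computes $\sum_{\mathbf{k}}\sigma(\mathbf{k})$ (the limiting variance of the scaled sample mean) and $\sum_{\mathbf{k}}\|\mathbf{k}\|_1\sigma(\mathbf{k})$ (the $B_0$ entry for the square template in Table 5), with a brute-force truncated sum as a cross-check. The function names and the `cutoff` are hypothetical.

```python
import math


def model_e_sums(beta1, beta2):
    """Closed-form sums over k in Z^2 for sigma(k) = exp(-beta1*|k1| - beta2*|k2|):
    returns (sum of sigma(k), sum of ||k||_1 * sigma(k))."""
    rhos = (math.exp(-beta1), math.exp(-beta2))
    s0 = [(1 + r) / (1 - r) for r in rhos]        # sum over j of r^|j|
    s1 = [2 * r / (1 - r) ** 2 for r in rhos]     # sum over j of |j| * r^|j|
    tau2 = s0[0] * s0[1]
    b0_square = s1[0] * s0[1] + s0[0] * s1[1]
    return tau2, b0_square


def model_e_sums_truncated(beta1, beta2, cutoff=150):
    """Brute-force version of the same two sums, truncated at |k_i| <= cutoff."""
    tau2 = 0.0
    b0_square = 0.0
    for k1 in range(-cutoff, cutoff + 1):
        for k2 in range(-cutoff, cutoff + 1):
            s = math.exp(-beta1 * abs(k1) - beta2 * abs(k2))
            tau2 += s
            b0_square += (abs(k1) + abs(k2)) * s
    return tau2, b0_square


if __name__ == "__main__":
    for betas in ((0.5, 0.3), (1.0, 1.0)):
        print(betas, model_e_sums(*betas))
```

Smaller $\beta$ values give slower decay and larger sums, which is the sense in which E(0.5,0.3) carries stronger dependence than E(1,1).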

For each $R_n$ and covariance structure, we considered various amounts of
subsample scaling ${}_s\lambda_n$ in the estimator $\hat\tau_n^2 = \hat\tau_n^2({}_s\lambda_n)$ based on OL or NOL
subsamples. Here rectangular and circular subsamples correspond to
translates of ${}_s\lambda_n R_0$ for $R_0 = (-1/2,1/2]^2$, $\{\mathbf{x} \in \mathbb{R}^2 : \|\mathbf{x}\| \le 1/2\}$. We estimated the
normalized MSE, $E(\hat\tau_n^2/\tau_n^2 - 1)^2$, listing results in Table 6 for Model E. (To
save space, we omit similar tables for Model G, where the performance of
the estimators was better.) Estimates of optimal scaling appear in Table 7.
From these simulation results, we make the following observations:
Table 6

Normalized MSE $E(\hat\tau_n^2/\tau_n^2 - 1)^2$ for OL/NOL subsample variance estimators of
$\tau_n^2 = N_n\mathrm{Var}(\bar Z_{N_n})$ on $R_n \cap \mathbb{Z}^2$ (based on 10,000 simulations). An asterisk (*) denotes a
minimal MSE

             R_n = (-7,7] x (-9,9]                     R_n = {x in R^2 : ||x|| <= 9}
             E(0.5,0.3)          E(1,1)                E(0.5,0.3)          E(1,1)
s_lambda_n   OL        NOL       OL        NOL         OL        NOL       OL        NOL

1            0.9074    0.9074    0.5855    0.5855      0.9075    0.9075    0.5871    0.5871
2            0.7645    0.7619    0.3312    0.3298      0.7413    0.7417    0.3303    0.3330
3            0.6367    0.6343    0.2201    0.2264      0.6386    0.6378    0.2252    0.2346*
4            0.5490    0.5470    0.1926*   0.2191*     0.5991    0.6177    0.2332    0.2897
5            0.5051    0.5344    0.2106    0.3071      0.5255    0.5627    0.2126*   0.3444
6            0.4999*   0.4605*   0.2533    0.2911      0.5246*   0.4978*   0.2567    0.3369
7            0.5242    0.4957    0.3086    0.4004      0.5311              0.2925

             R_n = (-15,15] x (-21,21]                 R_n = {x in R^2 : ||x|| <= 20}
             E(0.5,0.3)          E(1,1)                E(0.5,0.3)          E(1,1)
s_lambda_n   OL        NOL       OL        NOL         OL        NOL       OL        NOL

4            0.5290    0.5285    0.1820    0.1851      0.5849    0.5846    0.1825    0.1866
5            0.4370    0.4329    0.1170    0.1232      0.4743    0.4785    0.1186    0.1332
6            0.3693    0.3601    0.1115    0.1380      0.4180    0.4236    0.1119    0.1358
7            0.3226    0.3132    0.0983*   0.1172*     0.3698    0.3716    0.1007*   0.1257*
8            0.2931    0.2963    0.1061    0.1453      0.3313    0.3466    0.1055    0.1596
9            0.2777    0.2822    0.1085    0.1613      0.2901    0.3333    0.1119    0.2080
10           0.2734*   0.2542*   0.1298    0.2247      0.2849    0.3084*   0.1254    0.2049
11           0.2779    0.3454    0.1388    0.2824      0.2803*   0.3814    0.1397    0.3335
12           0.2891    0.3298    0.1680    0.2889      0.2868    0.3662    0.1596    0.3359


1. At optimal scaling, the MSEs of OL and NOL subsamples were similar.
Under the strongest r.f. dependence in Model E(0.5,0.3), NOL subsamples
performed better. For the other covariogram models entailing weaker
dependence, OL subsamples were always better.

2. Unlike with OL subsamples, the MSEs with NOL subsamples increased
more rapidly when optimal scaling was not used. This implies estimation
of ${}_s\lambda_n^{opt}$ with OL subsamples is preferable.


Table 7

Optimal subsample scaling ${}_s\lambda_n^{opt}$ for variance estimation of sample mean $\sqrt{N_n}\bar Z_{N_n}$
(determined from 10,000 simulations)

                            E(0.5,0.3)    G(0.5,0.3)    E(1,1)        G(1,1)
R_n                         OL    NOL     OL    NOL     OL    NOL     OL    NOL

(-7,7] x (-9,9]             6     6       4     4       4     4       3     3
(-15,15] x (-21,21]         10    10      7     6       7     6       5     5
{x in R^2 : ||x|| <= 9}     6     6       5     3       5     3       3     3
{x in R^2 : ||x|| <= 20}    11    10      7     7       7     7       5     5
3. Table 7 shows that OL and NOL optimal scaling tended to be the same.
NOL subsample scaling becomes clearly smaller in larger sample sizes;
see also Table 4.

4. Optimal subsample scaling also decreased as the r.f. dependence structure
weakened (e.g., faster decay of covariogram). In this case, the performance
of the variance estimators also improved.


8.2. Comparison of scaling estimation methods. We also compared NPI
and HJ estimation methods for scaling ${}_s\lambda^{opt}_{n,OL}$ with OL subsamples, using
the covariogram models and sampling regions $R_n$ from Section 8.1. We again
took the sample mean $\hat\theta_n = \bar Z_{N_n}$. For the NPI method, we chose smoothing
parameters $c_1, c_2 \in \{0.5,1,2\}$. For each $R_n$, we used two pilot subsample
sizes $\lambda_m$ for the HJ method. As a measure of performance of the NPI and
HJ procedures, we considered the following quantity:


( 8 . 2 ) 




where $\hat\tau^2_{n,OL}({}_s\lambda_n)$ denotes the OL subsample variance estimator using
scaling ${}_s\lambda_n$, ${}_s\hat\lambda^{opt}_{n,OL}$ represents an estimate of optimal scaling ${}_s\lambda^{opt}_{n,OL}$ and
$\tau_n^2$ is the variance parameter. Hence, $\phi_n$ measures the relative deviation of an
OL subsample estimator of $\tau_n^2$ based on estimated scaling compared to the
“best” OL subsample estimator. Values of $\phi_n$ near zero would suggest that
$\hat\tau^2_{n,OL}({}_s\hat\lambda^{opt}_{n,OL})$ performed nearly as well as the optimal subsample estimator
$\hat\tau^2_{n,OL}({}_s\lambda^{opt}_{n,OL})$.


From the results reported partially in Table 8, the choices of smoothing
parameters

$c_2 = 0.5$ and $c_1 \in \{0.5, 1\}$

gave good results for estimating ${}_s\lambda^{opt}_{n,OL}$ in the NPI approach. We recommend
these values for implementing the NPI method. The HJ method also
tended to perform better with smaller smoothing parameter choices $\lambda_m$,
which agrees with the $\lambda_m$ selections of Hall and Jing (1996) for time
series. (We chose $\lambda_m$ so that an estimated MSE could be minimized over at
least five different ${}_s\lambda_m$ arguments.) Table 9 gives frequency distributions of
estimated optimal scaling ${}_s\hat\lambda^{opt}_{n,OL}$ under other covariogram models and
regions $R_n$. Table 7 lists values of ${}_s\lambda^{opt}_{n,OL}$. These results indicate that the NPI
and HJ procedures exhibit good finite sample properties in estimating ${}_s\lambda_n^{opt}$
and are competitive.
Table 8

Values $\phi_n$ for NPI and HJ methods (each based on 1000 simulations), where $\phi_n$ is as
in (8.2). HJ method uses $(\lambda_{m_1}, \lambda_{m_2}) = (5,10), (7,14), (3,6), (4,8)$, respectively, on regions $R_n$ from
left to right. Minimal MSE is denoted with an asterisk for each $R_n$ and covariogram model

R_n:              (-7,7] x (-9,9]       (-15,15] x (-21,21]   {x : ||x|| <= 9}      {x : ||x|| <= 20}
c_1    c_2        E(0.5,0.3)  G(1,1)    E(0.5,0.3)  G(1,1)    E(0.5,0.3)  G(1,1)    E(0.5,0.3)  G(1,1)

0.5    0.5        0.0022*     0.0106    0.0025      0.0075    0.0013      0.0093    0.0015      0.0034
       1          0.0654      0.0614    0.0296      0.0288    0.0405      0.0559    0.0139      0.0266
       2          0.0703      0.2470    0.1044      0.1000    0.0405      0.2532    0.1628      0.0862
1      0.5        0.0299      0.0031*   0.0101      0.0027    0.0118      0.0047*   0.0334      0.0011*
       1          0.0065      0.0706    0.0019*     0.0206    0.0030      0.0644    0.0006*     0.0192
       2          0.0412      0.2040    0.0317      0.0968    0.0233      0.2098    0.0600      0.0911
2      0.5        0.0412      0.0352    0.0369      0.0055    0.0212      0.0205    0.0709      0.0029
       1          0.0040      0.1081    0.0051      0.0157    0.0010      0.0961    0.0152      0.0133
       2          0.0439      0.2582    0.0278      0.1346    0.0255      0.2676    0.0134      0.1206
HJ, lambda_m1     0.0100      0.0098    0.0161      0.0001*   0.0001*     0.0709    0.0334      0.0288
HJ, lambda_m2     0.0178      0.1766    0.0048      0.0130    0.0069      0.0360    0.0630      0.0337


Table 9 

Frequency distribution of estimated optimal OL subsample scaling with NPI and HJ
methods (based on 1000 simulations). Along with $c_2 = 0.5$, NPI1 and NPI2 use $c_1 = 0.5$
and 1, respectively. True optimal scaling values are given in Table 7




Estimates 

opt 

ti,OL 

of optimal scaling ^ 

\ opt 
^ti,OL 

iirt/Model 

Method 

2 3 

4 

5 

6 

7 

8 

9 10 

(-7,7] X (-9,9] 

NPIl 

98 

901 

1 





E(l.l) 

NPI2 

307 

686 

7 






HJ, Am = 5 

150 

850 






|x G R^ : llxll < 9| 

NPIl 


7 

993 





G(0.5,0.3) 

NPI2 

HJ, Am = 3 


876 

124 

963 


37 


(-15,15] X (-21,21] 

NPIl 


2 

9 

62 

276 

450 

192 9 

E(l.l) 

NPI2 


1 

14 

241 

726 

18 



HJ, Am = 7 

1 


856 


143 



{xgRT||x|| <20} 

NPIl 



2 

21 

272 

590 

115 

G(0.5,0.3) 

NPI2 



2 

134 

864 




HJ, Am = 4 





723 


277 


9. Proofs for variance expansions. For the proofs, we use $C$ to denote
generic positive constants that do not depend on $n$ or any integers (or
$\mathbb{Z}^d$ lattice points). The real number $r$, appearing in some proofs, always
assumes the value stated under Condition $M_r$ with respect to the lemma
or theorem under consideration. Unless otherwise specified, limits in order
symbols are taken letting $n$ tend to infinity.

In the following, we denote the indicator function as $I_{\{\cdot\}}$ (i.e., $I_{\{\cdot\}} \in \{0,1\}$
and $I_{\{A\}} = 1$ if and only if an event $A$ holds). For two sequences $\{s_n\}$ and
$\{t_n\}$ of positive real numbers, we write $s_n \sim t_n$ if $s_n/t_n \to 1$ as $n \to \infty$.
We write $\lambda_{\max}^{(n)}$ and ${}_s\lambda_{\max}^{(n)}$ for the largest diagonal entries of $\Delta_n$ and ${}_s\Delta_n$,
respectively, while ${}_s\lambda_{\min}^{(n)} \ge 1$ will denote the smallest diagonal entry of ${}_s\Delta_n$.
We require a few lemmas for the proofs.

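Lemma 9.1. Suppose $T_1, T_2 \subset \mathcal{Z}^d \equiv \mathbf{t} + \mathbb{Z}^d$ are bounded. Let $p, q > 0$,
where $1/p + 1/q < 1$. If $X_1$, $X_2$ are random variables, with $X_i$ measurable
with respect to $\mathcal{F}_Z(T_i)$, $i = 1,2$, then

$\displaystyle |\mathrm{Cov}(X_1,X_2)| \le 8(E|X_1|^p)^{1/p}(E|X_2|^q)^{1/q}\,\alpha\bigl(\mathrm{dis}(T_1,T_2); \max_{i=1,2}|T_i|\bigr)^{1-1/p-1/q}$

provided expectations are finite and $\mathrm{dis}(T_1,T_2) > 0$.

The proof follows from Theorem 3, Doukhan [(1994), page 9].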


Lemma 9.2. Let $r \in \mathbb{Z}_+$. Under Assumption A.3 and Condition $M_r$, for
$1 \le m \le 2r$ and any $T \subset \mathcal{Z}^d \equiv \mathbf{t} + \mathbb{Z}^d$,

$\displaystyle E\Bigl\|\sum_{\mathbf{s}\in T}(Z(\mathbf{s}) - \mu)\Bigr\|^m \le C(\alpha)|T|^{m/2}.$

$C(\alpha)$ is a constant that depends only on the coefficients $\alpha(k,l)$, $l \le 2r$, and
$E\|Z(\mathbf{t})\|^{2r+\delta}$.

The proof follows from Theorem 1, Doukhan [(1994), pages 26-31] and
Jensen's inequality.

We next determine the asymptotic sizes of important sets relevant to the
sampling or subsampling designs.


Lemma 9.3. Under Assumptions A.1 and A.2, the number of sampling
sites within:

(a) the sampling region $R_n$: $N_n = |R_n \cap \mathbb{Z}^d| \sim |R_0| \cdot \det(\Lambda_n)$;

(b) an OL subsample, $R_{i,n}$, $i \in J_{OL}$: ${}_sN_n \sim |R_0| \cdot \det({}_s\Lambda_n)$;

(c) a NOL subsample, $R_{i,n}$, $i \in J_{NOL}$: ${}_sN_{i,n} \sim |R_0| \cdot \det({}_s\Lambda_n)$.

The number of:

(d) OL subsamples within $R_n$: $|J_{OL}| \sim |R_0| \cdot \det(\Lambda_n)$;

(e) NOL subsamples within $R_n$: $|J_{NOL}| \sim |R_0| \cdot \det(\Lambda_n) \cdot \det({}_s\Lambda_n)^{-1}$;
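(f) sampling sites near the border of a subsample $R_{i,n}$ (OL or NOL) is less
than

$$\sup_{i \in \mathbb{Z}^d}\big|\{j \in \mathbb{Z}^d : T_j \cap \overline{R_{i,n}} \ne \emptyset,\ T_j \cap \overline{R_{i,n}^c} \ne \emptyset \text{ for } T_j = j + [-2,2]^d\}\big| \le C\,({}_s\lambda_n^{\max})^{d-1}.$$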


Results follow from the boundary condition on Rq-, see Nordman (2002) for 
more details. 
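Lemma 9.3(a) can be checked numerically in a simple case. The sketch below is only an illustration under assumptions that are not part of the paper's argument: it takes $R_0$ to be the disk of radius 1/2 in the plane and $\Lambda_n = \lambda I$, and the function names are hypothetical. It compares the lattice point count with the volume $|R_0| \cdot \det(\Lambda_n) = \pi\lambda^2/4$.

```python
import math

def lattice_points_in_disk(radius):
    """Count integer points (x, y) with x^2 + y^2 <= radius^2."""
    r2 = radius * radius
    limit = int(math.floor(radius))
    count = 0
    for x in range(-limit, limit + 1):
        # Largest |y| with y^2 <= r2 - x^2 (y^2 is an integer, so flooring is safe).
        y_max = math.isqrt(int(math.floor(r2 - x * x)))
        count += 2 * y_max + 1
    return count

def count_to_volume_ratio(lam):
    """Ratio N_n / (|R_0| det(Lambda_n)) for R_0 a disk of radius 1/2, Lambda_n = lam * I."""
    radius = lam / 2.0
    return lattice_points_in_disk(radius) / (math.pi * radius ** 2)

if __name__ == "__main__":
    for lam in (10, 100, 400):
        print(lam, count_to_volume_ratio(lam))
```

The ratio approaches 1 as `lam` grows because the discrepancy between count and volume is concentrated near the boundary, which is the role played by the boundary condition on $R_0$ in the lemma.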

We require the next lemma for counting the number of subsampling regions
which are separated by an appropriately “small” integer translate; we
shall apply this lemma in the proof of Theorem 3.1. For $k = (k_1,\ldots,k_d)' \in \mathbb{Z}^d$,
define the following sets:

$$J_n(k) = |\{i \in J_{OL} : i + k + {}_s\Lambda_n R_0 \subset R_n\}|,$$

$$E_n = \{k \in \mathbb{Z}^d : |k_j| \le {}_s\lambda_j^{(n)},\ j = 1,\ldots,d\}.$$


Lemma 9.4. Under Assumption A.2,

$$\max_{k \in E_n}\left|1 - \frac{J_n(k)}{|J_{OL}|}\right| = o(1).$$


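Proof. For $k \in E_n$, define $J_n^*(k)$ and bound it as

$$J_n^*(k) = |\{i \in J_{OL} : (i + k + {}_s\Lambda_n R_0) \cap R_n^c \ne \emptyset\}| \le \big|\{i \in \mathbb{Z}^d : T_i \cap \overline{\Lambda_n R_0} \ne \emptyset,\ T_i \cap \overline{(\Lambda_n R_0)^c} \ne \emptyset;\ T_i = i + {}_s\lambda_n^{\max}[-2,2]^d\}\big|$$

by the boundary condition on $R_0$. We have then that for all $k \in E_n$,

$$|J_{OL}| \ge J_n(k) = |J_{OL}| - J_n^*(k) \ge |J_{OL}| - C\,{}_s\lambda_n^{\max}(\lambda_n^{\max})^{d-1}.$$

By Assumption A.2 and the growth rate of $|J_{OL}|$ from Lemma 9.3, the proof
is complete. □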


We now provide a theorem which captures the main contribution to the 
asymptotic variance expansion of the OL subsample variance estimator 
from Theorem 3.1. 


Theorem 9.1. For $i \in \mathbb{Z}^d$, let $Y_{i,n} = \nabla'(\bar Z_{i,n} - \mu)$. Under the assumptions
and conditions of Theorem 3.1,

$${}_sN_n\sum_{k \in E_n}\mathrm{Cov}(Y_{0,n}^2, Y_{k,n}^2) = K_0 \cdot [2\tau^4](1 + o(1)),$$

where the constant $K_0$ is defined in Theorem 3.1.
Proof. We give only a sketch of the important features; for more details,
see Nordman (2002). For a set $T \subset \mathbb{R}^d$, define the function $\Sigma(\cdot)$ as

$$\Sigma(T) = \sum_{s \in \mathbb{Z}^d \cap T}\nabla'(Z(s) - \mu).$$

With the set intersection $R^{(I)}_{k,n} = {}_sR_n \cap (k + {}_sR_n)$, $k \in \mathbb{Z}^d$, write functions

$$H_{1n}(k) = \Sigma(R_{k,n} \setminus R^{(I)}_{k,n}), \qquad H_{2n}(k) = \Sigma(R_{0,n} \setminus R^{(I)}_{k,n}), \qquad H_{3n}(k) = \Sigma(R^{(I)}_{k,n}).$$

These represent, respectively, sums over sites in $R_{k,n}$ but not $R_{0,n} = {}_sR_n$,
$R_{0,n}$ but not $R_{k,n}$ and both $R_{0,n}$ and $R_{k,n}$. Then define $h_n(\cdot): \mathbb{Z}^d \to \mathbb{R}$ as

$$h_n(k) = E[H_{1n}^2(k)]E[H_{3n}^2(k)] + E[H_{2n}^2(k)]E[H_{3n}^2(k)] + E[H_{1n}^2(k)]E[H_{2n}^2(k)] + E[H_{3n}^4(k)] - ({}_sN_n)^4[E(Y_{0,n}^2)]^2.$$

We will make use of the following proposition. 

Proposition 9.1. Under the assumptions and conditions of Theorem 3.1,

$$\max_{k \in E_n}\big|({}_sN_n)^2\,\mathrm{Cov}(Y_{0,n}^2, Y_{k,n}^2) - ({}_sN_n)^{-2}h_n(k)\big| = o(1).$$

The proof of Proposition 9.1 can be found in Nordman (2002) and involves 
cutting out lattice points near the borders of i?o,n and i?k,n) say, Po,n 
and i3k,n with 

(9.1) _ _ 

= {i G : i G (i + 4(-l, I]'") n Pf „ / 0}, j G Z^ 

where e = {K6/{{2r + 6){2r — 1 — 1/d)} + l)/2 < 1 from Condition M^. 
Here in oo, in = o(s-^|^n) chosen so that the remaining observations 
in Ro,n, Rk,n, R^\ are nearly independent upon removing Bo n,Bk,n points 
and, using the i?o-boundary condition, the set cardinalities |Ho,n|, |Hk,n| < 
are of smaller order than gNn (namely, these sets are asymp¬ 
totically negligible in size). 

By Proposition 9.1 and $|E_n| = O({}_sN_n)$, we have

$$\left|{}_sN_n\sum_{k \in E_n}\mathrm{Cov}(Y_{0,n}^2, Y_{k,n}^2) - ({}_sN_n)^{-3}\sum_{k \in E_n}h_n(k)\right| = o(1). \tag{9.2}$$

Consequently, we need only focus on $({}_sN_n)^{-3}\sum_{k \in E_n}h_n(k)$ to complete
the proof of Theorem 9.1.

For measurability reasons, we create a set dehned in terms of the 
Lebesgue measure, 

E+^ (0,1) n {e < : 


|{xeM'^:|(x + Aoi?o)n Aoi?o| =e or det(Ao)|i?o| -e}| =0 


Note the set (0,1) n (0, det(Ao)|i?o|/2) \E+ is at most countable [cf. Billings¬ 
ley (1986), Theorem 10.4]. For e G E"*", define a new set as a function of e 
and re: 


= >e(A 


I 




\ I 

\ ^k.,n\ 




Here „ c E„ because k ^ implies 1 — 0- 

We now further simplify $({}_sN_n)^{-3}\sum_{k \in E_n}h_n(k)$ using the following proposition
involving $R_{\varepsilon,n}$.


Proposition 9.2. There exist N G Z+ and a function b{-): E+ ^ (0,oo) 
such that 6(e) J, 0 as e | 0 and 


(9.3) (Wn)-' E ^-(k) <C(e + (AS"E' + [6(e)f), 

kGE„ kGi?E.n 

where C > 0 does not depend on e € E+ or re > N. 


The proof of Proposition 9.2 is tedious and given in Nordman (2002). The
argument involves bounding the sum of $h_n(\cdot)$ over two separate sets in $E_n$:
those integers in $E_n$ that are either “too large” or “too small” in magnitude
to be included in $R_{\varepsilon,n}$.

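To finish the proof, our approach (for an arbitrary $\varepsilon \in E^+$) will be to write
$({}_sN_n)^{-3}\sum_{k \in R_{\varepsilon,n}}h_n(k)$ as an integral of a step function $f_{\varepsilon,n}(x)$ with respect
to the Lebesgue measure, then show $\lim_{n\to\infty}f_{\varepsilon,n}(x)$ exists almost everywhere
(a.e.) on $\mathbb{R}^d$ and apply the Lebesgue dominated convergence theorem
(LDCT). By letting $\varepsilon \downarrow 0$, we will obtain the limit of ${}_sN_n\sum_{k \in E_n}\mathrm{Cov}(Y_{0,n}^2, Y_{k,n}^2)$.
Fix $\varepsilon \in E^+$. With counting arguments based on the boundary condition
of $R_0$ and the definition of $R_{\varepsilon,n}$, it holds that for some $N_\varepsilon \in \mathbb{Z}_+$ and all
$k \in R_{\varepsilon,n}$: $|R^{(I)}_{k,n} \cap \mathbb{Z}^d| \ge 1$ and ${}_sN_n - |R^{(I)}_{k,n} \cap \mathbb{Z}^d| \ge 1$ when $n \ge N_\varepsilon$. We can
rewrite $({}_sN_n)^{-2}h_n(k)$, $k \in R_{\varepsilon,n}$, in the well-defined form (for $n \ge N_\varepsilon$)

$$\begin{aligned}
\frac{h_n(k)}{({}_sN_n)^2} ={}& E\left[\frac{H_{1n}^2(k)}{{}_sN_n - |R^{(I)}_{k,n} \cap \mathbb{Z}^d|}\right] E\left[\frac{H_{2n}^2(k)}{{}_sN_n - |R^{(I)}_{k,n} \cap \mathbb{Z}^d|}\right]\left(1 - \frac{|R^{(I)}_{k,n} \cap \mathbb{Z}^d|}{{}_sN_n}\right)^2 \\
&+ \sum_{j=1}^{2} E\left[\frac{H_{jn}^2(k)}{{}_sN_n - |R^{(I)}_{k,n} \cap \mathbb{Z}^d|}\right] E\left[\frac{H_{3n}^2(k)}{|R^{(I)}_{k,n} \cap \mathbb{Z}^d|}\right] \times \left(1 - \frac{|R^{(I)}_{k,n} \cap \mathbb{Z}^d|}{{}_sN_n}\right)\frac{|R^{(I)}_{k,n} \cap \mathbb{Z}^d|}{{}_sN_n} \\
&+ E\left[\frac{H_{3n}^4(k)}{|R^{(I)}_{k,n} \cap \mathbb{Z}^d|^2}\right]\left(\frac{|R^{(I)}_{k,n} \cap \mathbb{Z}^d|}{{}_sN_n}\right)^2 - [{}_sN_n E(Y_{0,n}^2)]^2.
\end{aligned}$$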


For X = (xi,..., XdY G K'^, write [xj = ([xij,..., [xd\)' G and x„ = 


[sA^^^xJ. Let /£,„(x):] 


be the step function defined as 


fs,n{'^) — (s-An) 

We have then that (with the same fixed e G E"*") 


(9.4) 


1 ( 

^ (.iV^-^h^k) = 1 ^ 


.W, 


kEi?e,n 
We focus on showing 


N 

s-‘-^n 


/ /£,n(x) 


hx. 




(9.5) 


= I 


{xG.R£} 


[2r^ 


4l f I ( x + Api^o) n Api^ol 
det(Ao)|i?o| 


a.e. X G 


with .Re = {x G |(x + Aoi?o) H Aoi?o| > |Aoi?o \ (x + Aoi?o)| > e} a 
Borel measurable set. 

To establish (9.5), we begin by showing convergence of indicator functions 

hd 


(9.6) 


^{x„&Re,-n.} -^{xSRe} 


a.e. X G. 


Dehnethesets An(x) = (sA^”^) ^{(x„T^iZn)^n(x) = {(sA^”^) ^sRn}\ 
T„(x) as a function of x G M'^. The LDCT can be applied to show that for 
each X G |^n(x)| |(x + Aq/^o) n Aoi?o| and |^n(x)| ^ |AoRo \ (x + 

Aoi?o)|- Thus, if X G i?e, then 


|^n(x)| ^ |(x + Aoi?o)n AqRoI >e, 
|i„(x)| ^ |Aoi?o \ (x + AoRo)| > e, 


implying further that 1 = = 1 as n —> oo. Now consider 

i?£. If X ^ i?e such that |(x +AqRo) Li AqRoI < £ [or | AoRo\ (x +Aoi?o)| < 
then |^ji(x)| < £ [or |A„(x)| < e] eventually for large n and 0 = .I{x„g.Re n} 
^{xgAe} = 0 ^ this case. Finally, £ G E^ implies that a last possible subset 
of has Lebesgue measure zero; namely, |{x G : |(x +Aq-Rq) ^

or IAqRo \ (x + AqRo)! = £}| = 0- We have now proven (9.6). 

We next establish a limit for (Wn)~^^n(xn), x G Re. We wish to show 

((^o\ |Rxi,nnZ'^| I(x + AqRo) n AqRoI ^5 

~^n -’-det{A„)|%|-’ 


Using the bound | |R® ,n| — |Rx],nnZ'^| | < from the Ro-boundary 

condition and noting the limit in (9.7) for (sA^"'^)“'^|R®,ri| = |^n(x)|, we 
find (sA^"'^)“'^|Rxi,n H Z*^! —> |(x + AqRo) H AqRoI, x G Rg. By this and 
[sX^^Y/sNn (det(Ao)|Ro|)~^ (9.8) follows. 

We can also establish: for each x G Rg, j = 1 or 2, 


E 


(9.9) 


-^3 n(Xn) 

nZ'^lf 


|R 


(I) 


E 




sNn-\R^lnnZ<i\ 


sNnEiYl 


E([V'Zoo]"'), 

• E([V'Z< 


'^oo]^), 


where $\nabla'Z_\infty$ is a normal $N(0,\tau^2)$ random variable and so it follows that
$E([\nabla'Z_\infty]^{2j}) = (2j - 1)\tau^{2j}$, $j = 1,2$. The limits in (9.9) follow essentially
from the central limit theorem (CLT) of Bolthausen (1982), after verifying
that the CLT can be applied; see Nordman (2002) for more details.
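The moment identities for the limiting normal variable are what produce the factor $[2\tau^4]$ in the variance expansions, since $\mathrm{Var}(Z^2) = E(Z^4) - [E(Z^2)]^2 = 3\tau^4 - \tau^4$ for $Z \sim N(0,\tau^2)$. The following sketch checks this numerically; the function name, the value of `tau` and the midpoint-rule integrator are illustrative choices, not anything taken from the paper.

```python
import math

def normal_moment_numeric(tau, power, half_width=12.0, steps=200_000):
    """Approximate E[Z**power] for Z ~ N(0, tau^2) with the midpoint rule."""
    a = -half_width * tau
    h = 2.0 * half_width * tau / steps
    total = 0.0
    for i in range(steps):
        z = a + (i + 0.5) * h
        density = math.exp(-0.5 * (z / tau) ** 2) / (tau * math.sqrt(2.0 * math.pi))
        total += z ** power * density
    return total * h

if __name__ == "__main__":
    tau = 1.5
    m2 = normal_moment_numeric(tau, 2)   # compare with tau^2
    m4 = normal_moment_numeric(tau, 4)   # compare with 3 * tau^4
    print(m2, tau ** 2)
    print(m4, 3 * tau ** 4)
    print(m4 - m2 ** 2, 2 * tau ** 4)    # variance of Z^2 against 2 * tau^4
```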

Putting (9.6), (9.8) and (9.9) together, we have shown the (a.e.) convergence
of the univariate functions $f_{\varepsilon,n}(x)$ as in (9.5). For $k \in R_{\varepsilon,n}$ and
$n \ge N_\varepsilon$, Lemma 9.2 ensures $({}_sN_n)^{-2}|h_n(k)| \le C$, implying that for $x \in \mathbb{R}^d$:
$|f_{\varepsilon,n}(x)| \le C\,I_{\{x \in [-c,c]^d\}}$ for some $c > 0$ by Assumption A.1. With this
uniform bound on $f_{\varepsilon,n}(\cdot)$ and the limits in (9.5), we can apply the LDCT to
get


(9.10) 



eGE+. 


Let {em}m=iU E+ where £^1 0. Then R^^ C Ao[-l, l]‘^and limm^oo7|^g^^^| 
/{xg^o} X / 0 G M'^, with Ro = {x G M'^:0 < |(x -|- AqRo) Cl AqRoI < 
det(Ao)|Ro|}. Hence, by the LDCT, 


(9.11) 


lim 

m—>co 





/o(x) = /|^g^^}[2r4] 


/ 1 (x -L AqRq) n AqRqI 
V det(Ao)|Ro| 


2 
From (9.2)-(9.4), (9.10) and (9.11) and $({}_s\lambda_n^{\max})^d/{}_sN_n \to (\det(\Delta_0)|R_0|)^{-1}$, we
have that


limsup 

n—^oo 




/o(x)dx 


< limsup 


+ 


^ cov(yo',„,yk',j-^^^ /_ ,/,^,„(x)(ix 


keB„ 


s^n JR'* ' 


det(Ao)|iio| 


+ lim sup 


(.a‘”V 


Nn 

d\ 


[ - f0{^) dx 

JRJ 

[ femA^)d' 

jRd 


^ C{£m[b{£m)] ) + 

^0 as Sm i 0. 


det(Ao)|i?o| 


det(Ao)|i?o| 

/ ,Am(x) -/o(x)(ix 
JR-* 


f fem.{^)dyi 

jRd. 


Finally, 


\Ro\Im 


2t^ f |(y + i?o)ni?oP 


det(Ao)|i2o| 


\Ro\ Vr-* W 


dy, 


using a change of variables $y = \Delta_0^{-1}x$. This completes the proof of
Theorem 9.1.

□ 
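The limit obtained above involves the overlap integral $\int_{\mathbb{R}^d}|(y + R_0) \cap R_0|^2/|R_0|^2\,dy$. As a hedged illustration, assume $R_0 = (-1/2,1/2]^d$, an example chosen here for convenience. Then $|R_0| = 1$, the overlap volume factorizes as $\prod_j(1 - |y_j|)_+$, and the integral equals $(2/3)^d$. The sketch below, with hypothetical function names, evaluates the integral with a midpoint rule for $d = 1, 2$.

```python
import itertools

def cube_overlap(y):
    """|(y + R_0) ∩ R_0| for R_0 = (-1/2, 1/2]^d: product of (1 - |y_j|)_+."""
    vol = 1.0
    for yj in y:
        vol *= max(0.0, 1.0 - abs(yj))
    return vol

def overlap_integral(d, steps=200):
    """Midpoint-rule approximation of the integral of cube_overlap(y)^2 over [-1, 1]^d."""
    h = 2.0 / steps
    nodes = [-1.0 + (i + 0.5) * h for i in range(steps)]
    total = 0.0
    for y in itertools.product(nodes, repeat=d):
        total += cube_overlap(y) ** 2
    return total * h ** d

if __name__ == "__main__":
    for d in (1, 2):
        print(d, overlap_integral(d, steps=100), (2.0 / 3.0) ** d)
```

For $d = 1$ the value 2/3 multiplied by $2\tau^4$ gives $4\tau^4/3$, the familiar constant for overlapping blocks in the time series setting.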

For clarity of exposition, we will prove Theorem 3.1, parts (a) and (b),
separately for the OL and NOL subsample variance estimators.


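9.1. Proof of Theorem 3.1(a). For $i \in J_{OL}$, we use a Taylor expansion
of $H(\cdot)$ (around $\mu$) to rewrite the statistic $\hat\theta_{i,n} = H(\bar Z_{i,n})$,

$$\begin{aligned}
\hat\theta_{i,n} &= H(\mu) + \sum_{\|\alpha\|_1 = 1}c_\alpha(\bar Z_{i,n} - \mu)^\alpha + 2\sum_{\|\alpha\|_1 = 2}\frac{(\bar Z_{i,n} - \mu)^\alpha}{\alpha!}\int_0^1(1 - \omega)D^\alpha H(\mu + \omega(\bar Z_{i,n} - \mu))\,d\omega \\
&\equiv H(\mu) + Y_{i,n} + Q_{i,n}.
\end{aligned} \tag{9.12}$$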

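We also have

$$\tilde\theta_n = H(\mu) + |J_{OL}|^{-1}\sum_{i \in J_{OL}}Y_{i,n} + |J_{OL}|^{-1}\sum_{i \in J_{OL}}Q_{i,n} \equiv H(\mu) + \bar Y_n + \bar Q_n.$$

Then

$$\hat\tau^2_{n,OL} = {}_sN_n\left[\frac{1}{|J_{OL}|}\sum_{i \in J_{OL}}Y_{i,n}^2 + \frac{1}{|J_{OL}|}\sum_{i \in J_{OL}}Q_{i,n}^2 + \frac{2}{|J_{OL}|}\sum_{i \in J_{OL}}Y_{i,n}Q_{i,n} - \bar Y_n^2 - \bar Q_n^2 - 2(\bar Y_n)(\bar Q_n)\right].$$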

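We establish Theorem 3.1(a) in two parts by showing

$$\begin{aligned}
&\text{(a)}\quad \mathrm{Var}\left(\frac{{}_sN_n}{|J_{OL}|}\sum_{i \in J_{OL}}Y_{i,n}^2\right) = K_0 \cdot \frac{\det({}_s\Lambda_n)}{\det(\Lambda_n)} \cdot [2\tau^4](1 + o(1)), \\
&\text{(b)}\quad \left|\mathrm{Var}(\hat\tau^2_{n,OL}) - \mathrm{Var}\left(\frac{{}_sN_n}{|J_{OL}|}\sum_{i \in J_{OL}}Y_{i,n}^2\right)\right| = o\left(\frac{\det({}_s\Lambda_n)}{\det(\Lambda_n)}\right).
\end{aligned} \tag{9.13}$$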

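We will begin with proving (9.13)(a). For $k \in \mathbb{Z}^d$, let $\sigma_n(k) = \mathrm{Cov}(Y_{0,n}^2, Y_{k,n}^2)$.
We write

$$\frac{({}_sN_n)^2}{|J_{OL}|^2}\,\mathrm{Var}\left(\sum_{i \in J_{OL}}Y_{i,n}^2\right) = \frac{({}_sN_n)^2}{|J_{OL}|^2}\left(\sum_{k \in E_n}J_n(k)\sigma_n(k) + \sum_{k \in \mathbb{Z}^d \setminus E_n}J_n(k)\sigma_n(k)\right) \equiv W_{1n} + W_{2n}.$$

By stationarity and Lemma 9.2, we bound $|\sigma_n(k)| \le E(Y_{0,n}^4) \le C({}_sN_n)^{-2}$,
$k \in \mathbb{Z}^d$. Using this covariance bound, Lemmas 9.3 and 9.4 and $|E_n| \le 3^d\det({}_s\Lambda_n)$,

$$\left|\frac{({}_sN_n)^2}{|J_{OL}|}\sum_{k \in E_n}\sigma_n(k) - W_{1n}\right| \le C\,\frac{|E_n|}{|J_{OL}|} \cdot \max_{k \in E_n}\left|1 - \frac{J_n(k)}{|J_{OL}|}\right| = o\left(\frac{\det({}_s\Lambda_n)}{\det(\Lambda_n)}\right). \tag{9.14}$$

Then applying Theorem 9.1 and Lemma 9.3,

$$\frac{({}_sN_n)^2}{|J_{OL}|}\sum_{k \in E_n}\sigma_n(k) = K_0 \cdot \frac{\det({}_s\Lambda_n)}{\det(\Lambda_n)} \cdot [2\tau^4](1 + o(1)). \tag{9.15}$$

By (9.14) and (9.15), we need only show that $W_{2n} = o(\det({}_s\Lambda_n)/\det(\Lambda_n))$
to finish the proof of (9.13)(a).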

For i G Z'^, denote a set of lattice points within a translated rectangular 
region: 

P,.n = (^i+ n(-r4”V2, r.A<“>l/2|^ nz", 
where [•] represents the “ceiling” function. Note that for k = (fci,..., kd)' £ 
7J^\En, there exists j £ {1,..., d} such that \kj \ > implying dis(i?o,nn 

Z'^,i2k,nnZ'^) > dis(Fo,n, .Fie, n) > 1- Hence, sequentially using Lemmas 9.1 and 9.2, 
we may bound the covariances crn(k), k £ \ En, with the mixing coeffi¬ 

cient a{-, •), 

|u„(k)| < 8[E(yo'f+^)/0]2"/(""+'^)a(dis(i?o,n n Z^i2k,n n Z''),,iV„)^/("'-+'^) 

< CGiV„)-2a(dis(Fo,„, Fk,„), . 

From the above bound and Jn{^)/\Joi^\ < 1, k £ we have 

oo / d \ 

|fF2n| <C| JolI"' E (T.C(cc,j,n)]a{x,sNnY^^^^^'\ 

x=l Vj=l / 

(9.16) 

C{x,j,n) = |{i e Z'^:dis(Fo,n,Fi^„) = x 

= inf{|uj - Wj\ : V £ Fo,n,w £ Fi^n}}|- 

The function C(^x,j,n) counts the number of translated rectangles Fi^n that 
lie a distance of x £ Z+ from the rectangle Tb,n, where this distance is 
realized in the jth coordinate direction for j = 1,... ,d. For i £ Z'^, x > 1 
and j £ {1,.. .,d}, if dis(Fo,n,Fi^n) = x = inf{|xj - Wj\ : v £ Fo,„,w £ 
then \ij \ = -|-x — 1 with the remaining components of i, namely im for 

m £ {1,..., d} \ {j}, constrained by \im\ < + x. We use this observation 

to further bound the right-hand side of (9.16) by 


C|Jol| ^E(E 


d 

n 


3(,aW+x) «(x,,iV„)^/(2r+5) 


x=l \j=lm=lj^m 


^ ^det(sA„) s^-j 




X = 1 


ai=£n + l 


^ ^ det(sAj2) 

- IdoLl 


din 




n2rd—d 


oo 

E x2”'='-''-'ai(x)^/(2.+5) 

X=(.n + 1 


/ det(sAn) A 
V det(A„) J ’ 


using Assumptions A.l, A.3, Condition Mr and In = o(sA|”„) with e from 
(9.1). This completes the proof of (9.13)(a). 

To establish (9.13)(b), first note that 


Var(fEL) 
< 






where Ain = VaicisNnY^), ^ 2 n = Var(| JqlI ^s^^nEigJoL Qln)^ = yar{sNnQl), 

^4n — Vai’d JqL I ^ fiYny^Aq 7qj^ hj.nQi T7.)i Acyn = YaV[gNnYnQn) ■ 

By (9.13)(a), it suffices to show that Ajn = o(det(<jA„)/det(A„)) for each 
j = 1,... ,5. We handle only two terms for illustration: Ain, ^ 4 n- 

Consider Ain- For s e n TA, let ti;(s) = [2'^det(5A„)]“^|{i G Jql :s G 
i + sA„i2o}| so that 0 < a;(s) < 1. By Condition and Theorem 3 [Doukhan 
(1994), page 31] (similar to Lemma 9.2), 


^in<E(i;") 


(9.17) 


(2'^det(5A„))'‘ 

\JoL\%Nnr 


u;{s)V{Z{s)-^^) 


■seRnClZd- 


^ (A„)2(det(,A0)" 

\J0L\%Nny ■ 

Then Ain = o(det(sA„)/det(A„)) follows from Lemma 9.3. 

To handle A 4 n, write o-in(k) = Cov(yo,nQo,n, Lk,nQk,n), k G Then 

i AT i^ 

Ain = ,% ”'|2 X] «4(k)0-ln(k) 

I*"'! ks- 


< 


..N„ 


\Jol\ 


X kln(k)| + X 

^ke£;„ keZ'^XE^ ^ 


= Ain{En) + Ain{En)- 

For k G En, note |o-ln(k)| < C{sNn)~^ using |lo,nQo,n| < C||Zo,n - lip(l + 
||^o,n ~1^||“) (from Condition D) with Lemmas 9.1 and 9.2. From this bound. 
Lemma 9.3 and \En\ < 3'^det(sA„), we find Ain{En) = o(det(sA„)/det(A„)). 
We next bound the covariances cJi„(k), k G \ 

|am(k)| < 

X «(dis(i2o,n n Z'^, i?k,n n z'^), 
<C(,A„)-"«(dis(Fo,n,Ek,„),.A„)'^/(2’'+^) 

by the stationarity of the random field Z{-) and Lemmas 9.1 and 9.2. Using 
this inequality and repeating the same steps used to majorize “lT 2 n” from 
the proof of (9.13)(a) [see (9.16)], we have Ain{E^) = o(det(sA„)/det(A„)). 
The proof of Theorem 3.1(a) is complete. 
9.2. Proof of Theorem 3.1(b). To simplify the counting arguments, we
assume here integer-valued ${}_s\Lambda_n$, implying ${}_sN_{i,n} = {}_sN_n$, $i \in \mathbb{Z}^d$. The
more general case, in which the NOL subregions may differ in the number
of sampling sites, is treated in Nordman (2002).

For each NOL subregion $R_{i,n}$, we denote the corresponding sample mean
$\bar Z_{i,n} = ({}_sN_n)^{-1}\sum_{s \in R_{i,n} \cap \mathbb{Z}^d}Z(s)$. The subsample evaluations of the statistic
of interest, $\hat\theta_{i,n}$, $i \in J_{NOL}$, can be expressed through a Taylor expansion
of $H(\cdot)$ around $\mu$, substituting the NOL sample mean $\bar Z_{i,n}$ for its OL counterpart in (9.12):
$\hat\theta_{i,n} = H(\bar Z_{i,n}) = H(\mu) + Y_{i,n} + Q_{i,n}$.
We will complete the proof of Theorem 3.1(b) in two parts by showing 


(9.18) 


(a) 

Var^ 

sNn 

Jnol 

(b) 

Var( 

'n,NOL 


■.E 

ieJNOL 


Y- = 


) — Var 


N 

|</nol| 


det (sA,.^) 

det(A„)|i?o| 



iG Jnol 




/ det(sA^) \ 

V det(A„) ) 


We will begin with showing (9.18) (a). For k G let Jn(k) = {i G Jnol ^ i + 
k G Jnol} and cTn(k) = Cov(YQ^^,yj^^). Then we may express the variance. 


Var 




I Jnol I, 


E m 


iGJNOL 


(9.19) 


isNn)^ 

I Jnol h 


Jn(k)CTn(k) 

VkGZ'i,0<||k||oo<l 

-f Y Jn(k)Jn(k) + I JNOL|Jn(0) I 

kGZ'*.llkll.^>l / 


kGZ‘*,||k||oo>l 

= Vln + V2n + I JnOlI ^ (s.V„)^Cr„(0). 


We first prove t/' 2 n = o(| JnolI ^), noting that det(sA„)/det(A„) = 0(| JnolI ^) 
from Lemma 9.3. 

When k = {ki, ..., G ||k||oo > I, then for some 1 < < J, 

dis(.Ro,n n Z‘^,.Rk,n n Z'^) > max {\kj \ - l)<iA^"'^ 

l<j<d 

If j G {1,..., d},j / mfc, we have 
Note also if k £ ||k||oo > 1, then 

|a(k)| <C{sNn)-‘^a{{\km,\-l)s\tlsNn) 


by Lemmas 9.1 and 9.2. Hence, we have 

d d 


\U2n\ < 


c 


y 

I’/nolI 




c ^ 

“ I-ZnolI 




\ max 

d—1 I _ 

(n) 




d-l 


< ^ (n) , 2 rd-d-l / , (n) N 5 /( 2 r+ 5 ) 


= o 


1 


'j I --ri j ? ^ ’ 

^k,n =: <! {x G : -1/2 • ,aJ”^ - 4 < < -1/2 • .aJ-’"^}, if = -1, 


. I-^nolI , 

by Assumptions A.l, A.3 and Condition Mr- 

We now show that Uin = o(| Jnol|~^)- For k £ Z'^, 0 < ||k||oo < 1, define 
the set 

{x £ : 1/2 • sXf < Xj < 1/2 • ^xf + 4}, 

• -1/2 • sXf -in< Xj < -1/2 • sXf'^^ 

0, if kj = 0, 

for each coordinate direction j = 1,..., d. Let Tk,n = Uj=i We decom¬ 
pose the sum: sAGLic,n = S(.Rk,n\Fk,n) + S(.Rk,nnrk,n) = 5k,n + <5k,n' Then, 

Uin = o(|Tnol|~^) follows from 1-4 below: 

1. |E(yo',„y„yj| < [E(yo%)E(|ynP)E(|yj3)]V3 = ^(l), using Lemma 9.2 
and 

d d 

|Fk,n n rk,„ n zG < ^ |i?k,n n n < 4det(,A0 

i=i i=i 

= oGAjGi)- 

2. Likewise, E(yo2„5^2j < [E(yo4jE(5i:4j] V2 = ^(l). 

3. Uy„E(y2j_(^Arj-iE(52 j|<4GAr„)-imax{[E(52 jE(5i:2j]V2^E(5^2j| 

o(l)- 

4. I Cov(yo^^„,5^„)| =0(1) by applying Lemmas 9.1 

and 9.2, Assumption A.3, and Condition My. with dis(i/k nCZ'^/Tk n; sF„n 

Z'')>4. 
Since $\sigma_n(0) = \mathrm{Var}(Y_{0,n}^2)$, the remaining quantity in (9.19) can be expressed
as

$$\frac{({}_sN_n)^2}{|J_{NOL}|}\sigma_n(0) = \frac{1}{|J_{NOL}|}\mathrm{Var}([\nabla'Z_\infty]^2)(1 + o(1)) = \frac{\det({}_s\Lambda_n)}{\det(\Lambda_n)|R_0|}[2\tau^4](1 + o(1))$$

by applying the CLT [as in (9.9)] and Lemma 9.3. We have now established
(9.18)(a).

We omit the proof of (9.18)(b), which resembles the one establishing (9.13)(b)
and incorporates arguments used to bound $U_{1n}$, $U_{2n}$; Nordman (2002) provides
more details.


10. Proofs for bias expansions. We will use the following lemma concerning
$\tau_n^2 = N_n\mathrm{Var}(\hat\theta_n)$ to prove the theorems pertaining to bias expansions of
$\hat\tau^2_{n,OL}$ and $\hat\tau^2_{n,NOL}$.

Lemma 10.1. Under the assumptions and conditions of Theorem 3.1,

$$\tau_n^2 = \tau^2 + O([\det(\Lambda_n)]^{-1/\max\{2,d\}}).$$
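Proof. By a Taylor expansion around $\mu$: $\hat\theta_n = H(\bar Z_{N_n}) = H(\mu) + Y_{N_n} + Q_{N_n}$
[replacing $\bar Z_{N_n}$ for $\bar Z_{i,n}$ in (9.12)] and so $N_n\mathrm{Var}(\hat\theta_n) = N_n\mathrm{Var}(Y_{N_n} + Q_{N_n})$.
For $k \in \mathbb{Z}^d$, let $N_n(k) = |\{i \in R_n \cap \mathbb{Z}^d : i + k \in R_n\}|$. It holds that
$N_n(k) \le N_n$ and

$$\begin{aligned}
N_n &\le N_n(k) + \big|\{i \in \mathbb{Z}^d : T_i \cap \overline{R_n} \ne \emptyset,\ T_i \cap \overline{R_n^c} \ne \emptyset;\ T_i = i + \|k\|_\infty[-1,1]^d\}\big| \\
&\le N_n(k) + C\|k\|_\infty^d(\lambda_n^{\max})^{d-1}
\end{aligned} \tag{10.1}$$

by the boundary condition on $R_0$. Also, by Lemma 9.1 and stationarity, for
each $k \ne 0 \in \mathbb{Z}^d$,

$$|\sigma(k)| \le C\alpha_1(\|k\|_\infty)^{\delta/(2r+\delta)}. \tag{10.2}$$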
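The count $N_n(k)$ in (10.1) can be made concrete for a square sampling region. The sketch below assumes, purely for illustration, $d = 2$, $R_n = (-\lambda/2, \lambda/2]^2$ with integer $\lambda$ and sites on $\mathbb{Z}^2$; in that case $N_n = \lambda^2$ and $N_n(k) = (\lambda - |k_1|)_+(\lambda - |k_2|)_+$, so that $N_n - N_n(k) \le 2\|k\|_\infty\lambda$, in line with the bound in (10.1). The function names are hypothetical.

```python
def shifted_overlap_count(lam, k):
    """Brute-force N_n(k) = |{i in R_n ∩ Z^2 : i + k in R_n}| for R_n = (-lam/2, lam/2]^2."""
    def inside(p):
        return all(-lam / 2 < c <= lam / 2 for c in p)

    half = lam // 2 + 1
    sites = [(x, y)
             for x in range(-half, half + 1)
             for y in range(-half, half + 1)
             if inside((x, y))]
    return sum(1 for (x, y) in sites if inside((x + k[0], y + k[1])))

def shifted_overlap_formula(lam, k):
    """Closed form (lam - |k_1|)_+ (lam - |k_2|)_+ for integer lam."""
    return max(0, lam - abs(k[0])) * max(0, lam - abs(k[1]))

if __name__ == "__main__":
    lam, k = 10, (3, -2)
    n_total = shifted_overlap_count(lam, (0, 0))
    n_shift = shifted_overlap_count(lam, k)
    print(n_total, n_shift, n_total - n_shift)
```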

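Using $|\{k \in \mathbb{Z}^d : \|k\|_\infty = x\}| \le Cx^{d-1}$, $x \ge 1$, the covariances are absolutely
summable over $\mathbb{Z}^d$:

$$\sum_{k \in \mathbb{Z}^d}|\sigma(k)| \le |\sigma(0)| + C\sum_{x=1}^{\infty}x^{d-1}\alpha_1(x)^{\delta/(2r+\delta)} < \infty. \tag{10.3}$$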
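The shell count used for (10.3) has the exact value $(2x+1)^d - (2x-1)^d$, which is at most $2d \cdot 3^{d-1}x^{d-1}$ for $x \ge 1$. The explicit constant $2d \cdot 3^{d-1}$ is one admissible choice of $C$ worked out for this illustration and is not taken from the paper. A brute-force check with hypothetical function names:

```python
import itertools

def sup_norm_shell_count(d, x):
    """Number of k in Z^d with sup-norm exactly x, by enumeration."""
    return sum(1 for k in itertools.product(range(-x, x + 1), repeat=d)
               if max(abs(c) for c in k) == x)

def shell_count_formula(d, x):
    """Closed form: cube of side 2x+1 minus the inner cube of side 2x-1."""
    return (2 * x + 1) ** d - (2 * x - 1) ** d

def shell_count_bound(d, x):
    """An explicit bound of the form C * x^(d-1) with C = 2 * d * 3^(d-1)."""
    return 2 * d * 3 ** (d - 1) * x ** (d - 1)

if __name__ == "__main__":
    for d in (1, 2, 3):
        for x in (1, 2, 3):
            print(d, x, sup_norm_shell_count(d, x), shell_count_bound(d, x))
```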
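From (10.1)-(10.3), we find

$$N_n\mathrm{Var}(Y_{N_n}) = N_n^{-1}\sum_{k \in \mathbb{Z}^d}N_n(k)\sigma(k), \tag{10.4}$$

$$\begin{aligned}
|N_n\mathrm{Var}(Y_{N_n}) - \tau^2| &\le \frac{1}{N_n}\sum_{k \in \mathbb{Z}^d}|N_n - N_n(k)| \cdot |\sigma(k)| \\
&\le C\,\frac{(\lambda_n^{\max})^{d-1}}{N_n}\sum_{x=1}^{\infty}x^{2d-1}\alpha_1(x)^{\delta/(2r+\delta)} = O([\det(\Lambda_n)]^{-1/d}).
\end{aligned} \tag{10.5}$$

By Condition D and Lemma 9.2, it follows that $N_n\mathrm{Var}(Q_{N_n}) = O([\det(\Lambda_n)]^{-1})$.

Finally, with bounds on the variance of $Y_{N_n}$ and $Q_{N_n}$, we apply the
Cauchy-Schwarz inequality to the covariance, $N_n|\mathrm{Cov}(Y_{N_n}, Q_{N_n})| = O([\det(\Lambda_n)]^{-1/2})$,
setting the order on the difference $|N_n\mathrm{Var}(\hat\theta_n) - \tau^2|$. □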


We give a few lemmas which help compute the bias of the estimators 

I'n.OL '^n.NOL- 

Lemma 10.2. Let Yi^n = (sM,n)"^ SsGZdniii „ A7'(Z(s) - ^), i e Sup¬ 
pose Assumptions A.1-A.5 and Conditions D 2 and M 2 +a hold with d>2 
with a as specified under Condition D 2 . Then 

E«Ol) - sNo,nEmj,E{rlf,OL) - 1^1"' E sNi,nE{%) 

is Jnol 

=:0([det(,A„)]-'/2) + o([det(,A„)]-i/''). 

Proof. We consider here only E(f^ q^). For integer sA„, the arguments 
for E(f^ nol) essentially the same; more details are provided in Nordman 
( 2002 ). ’ 

By stationarity and an algebraic expansion as in (9.12), 

E«ol) = sNn[E{YlJ + E(Q2 J 

+ 2E(yo,nQo,n) - E(y„2) _ E(g2) _ 2E(wg„)]. 

With the moment arguments based on Lemma 9.2 and Condition Dr, we 
have 

,A„E(yo2j<C, 

sNnEiQlJ, sNnE{Ql) < C{sNn)-\ 


( 10 . 6 ) 





D. J. NORDMAN AND S. N. LAHIRI 


where the bound on sNn^{Y^) follows from (9.17). By Holder’s inequality 
and Assumption A.2, 

E«Ol) = sNnE{Yi^J + OiisNn)-^/^) + 0{sNn{Nn)-^). 

Note that E(yo^,n) = s^n = s^o,n- Hence, applying Lemma 9.3 and 

Assumption A.2, we establish Lemma 10.2 for f^oL- ^ 

The next lemma provides a small refinement to Lemma 10.2 made possible 
when the function H{-) is smoother. We shall make use of this lemma in bias 
expansions of ql and in lower sampling dimensions, namely d = 1 

or 2. 


Lemma 10.3. Assume d = l or 2. In addition to Assumptions A.1-A.5, 
suppose that Conditions and Ms+q hold with a as specified under Con¬ 
dition ZI3. Then 


E{floL) - sNo,nE(Y^^JMfl^OL) - I wr' E sNl,nE{Y,l) 

iG Jnol 

r 0([det(sA„)]“^), ifd=l, 

1 o([det(sA„)]“^/2), ifd = 2. 


Proof. We again consider only ql. For i G Jql, we use a third-order 
Taylor expansion of each subsample statistic around fi: = H{fi) +Y[^ri + 

Qi,n where Y[ ffi — V {Zi ^i 


Qln= E 

M M 0L\ 

||a|| 1 =2 





/i)) doj. 


Here denotes the remainder term in the Taylor expansion and Qi^n is de¬ 
fined a little differently here compared to (9.12). Write the sample means for 
the Taylor terms: Yn, Qn as before, Cn = |«/ol|~^ SieJoL 0,n- The moment 
inequalities in (10.6) are still valid and, by Lemma 9.2 and Condition D, 
we can produce bounds sAnE(C'E))sAnE(C'^) <C(sA'n)~^- By Holder’s in¬ 
equality and the scaling conditions from Assumptions A.l and A.2, we then 
have 


E{fl 


OL 


) 


,iV„[E(yo'_,,) + 2E(yo,nQo,n)] + 


0([det(5A„)] ^), 

o([det(5A„)]“^/2), 


if d = 1, 
if d = 2. 
Since = sNo^nY{YQ^^), Lemma 10.3 for ql will follow by show¬ 

ing 

s./V„E(yb,n<3o,n) 

p 

(10.7) — s^n ^ ) Ci(ljkY[(^ZiQn lYji^ZjQyi /^)('^fc,0,n /^)] 

= 0([det(sA„)]“^), 

where Zq ^ = (Zi^o,n; • • •, -^p.o.n)^ G is a vector of coordinate sample means, 

Ci = dH{fi)/dxi-, ttj^k = 1/2 • d'^H{fi)/dxj dxk- 

Denote the observation Z{s) = {Zi{s), ..., Zp(s))' G s G Z'^. Fix i,j, k G 
{!,... ,p} and w.l.o.g. assume fj, = 0. Then sA„|E(Zi,o,n^j,o,n^fc,o,n)| = \lsNn)~^E{Zi(t)Zj{t)Zk{t)) + 
Lin+L'in \ where 

Llif = (,iV„)-2 ^ E[Z,(u)Z,-(v)Zfc(w)], 

u^v^-w^Z^nsRn 

Lt = {sNn)-^ E E[Z,(u)Z,(u)Z,(v) 
u.vGZdPsiin 

UT^v 

-L Zi{u)Zj{v)Zk{u) + Zi{v)Zj{u)Zkiu)]. 

By Lemma 9.1, Assumption A.3 and Condition 

p C30 

l4?l < -Gi:x''-‘c(x,l)‘/P"+*) =0([det(.A„)]->), 

X=1 

similarly to (10.3). For yi,y2, ys G define disadyi, y2, ys}) = maxi<i<3 dis({yi}, {yi,y2, ys} \ 
{y*}). If X > 1 G Z+, then |{(yi,y2) G (Z'^)^ : dis3({yi,y2,0}) = x}| < 
from Theorem 4.1, Lahiri (1999a). Thus, 

p oo 

X=1 

This establishes (10.7), completing the proof of Lemma 10.3 for t^ql- ^ 

We use the next lemma in the proof of Theorem 4.2. It allows us to 
approximate lattice point counts with Lebesgue volumes, in or M^, to a 
sufficient degree of accuracy. 

Lemma 10.4. Let d = 2,3 and Rq C (—1/2,1/2]“^ such that B° c Rq C B 
for a convex set B. Let {hn}^=i he a sequence of positive real numbers such 
that —> oo. // k G then there exist Nk G Z+ and > 0 such that for 

n > Nk, i G 

|(|6„i2o|-|Z''n6„(i + i2o)|) 

— (|6„i?o n k + hnRif — n 5,i(i + Rq) H k + 6„(i + i?o)|)| 

{C2\MIo, ifd = 2, 

~\c3MUbl^^ + ^Knbl), tfd = 3, 

where {^k,n}5^Li C M is a nonnegative sequence (possibly dependent on'k) 
such that ^k,n —> 0. 

The proof is provided in Nordman and Lahiri (2002). 

To establish Lemma 4.1, we require some additional notation. For i,k G 
Z"^, and let sd^i,n{^) = jZ*^ n Ri n H k + Ri^n\ denote the number of sampling 
sites (lattice points) in the intersection of a NOL subregion with its k- 
translate. Note sA^i^n(k) is a subsample version of Nnik) from (10.1). 

Proof of Lemma 4.1. We start with bounds 

(10.8) sup \sNn - sNi,n\ < C(.Ar")''"\ 
ieZ'* 

IsNi^n - siVi,n(k)| 

(10.9) Pn7R^^0- rj=j + ||k||oo[-2,2]‘^}| 

^ PWlrW^ ( \max\d—1 
—'^ll^llooU'^n ) 5 

by the boundary condition on Rq (cf. Lemma 9.3) and infjggd ||sA„i—j||oo < 

1/2. 

Modify (10.4) by replacing iV„, iV„(k), Yat^ with ^iVi^„(k), = V'(Yi_„ - 

/i) (i.e., use a NOL subregion in place of the sampling region), and replace 
Nn, An, with the subsample analogs sd^i,n, sA„, in (10.5). We then 
find, using (10.3), for each i G Z'^, 

(10.10) ,A^i,„E(Yi2„) - r2 = Y. (sM,n(k) - Wi,n)a(k) = Ji^n, 

kezd 


sup \sli,n\ < sup AT l«l^i,n(k) sd^i,n\ ' |<z(k)| 


<C- 


( \maxAd —1 

U^n / 


sNn - C(,Ar")‘^"^ 


C50 

31=1 


= O([det(,A0]-'/"), 


( 10 . 11 ) 
from (10.8), (10.9) and Assumption A.1. Now applying Lemma 10.1 and
Assumption A.2 with Lemma 10.2 for $d \ge 2$ or Lemma 10.3 for $d = 1$,
Lemma 4.1 follows. □


Proof of Theorem 4.1. Here sNi^n = sNn, sA'i,n(k) = C'n(k), E(y'i,„) = 
E(yo,n) for each i, k S (since s^n £ for NOL subsamples) and det(sA„) = 
sXn'^. Applying Lemma 10.2 for d>3 and Lemma 10.3 for d = 2, Lemma 10.1, 
Assumption A.2 and (10.11), 


E(f2)-r2 = 


-1 


An I Rq I 


5n(k) + o(A 




k&d- 


,,, sNn-CnO^) .A/|i 2 o| 
9n{k)^ -—-•a(k). 


.sA, 


.Nr,. 


From (10.11) and Lemma 9.3, it follows that bn(k)| <C, n G Z_|_, and 

that 5n(k) ^ C'(k)(T(k) for k G Z'^. By the LDCT, the proof of Theorem 4.1 
is complete. □ 


To establish Theorem 4.2, we require some additional notation. For i, k G 
Z'^, denote the difference between two Lebesgue volume-for-count approxi¬ 
mations as 


A,n(k) = (|.Ri,„| - sNi,n) - {\Ri,n H k -L Ri^n\ “ sA'i,n(k)) 
= (Ui?„| - - (|,i?„ n k + sRn\ - .A,n(k)). 


Proof of Theorem 4.2. We handle here the cases d = 2 or 3. Details 
on the proof for d = 1 are given in Nordman (2002). We note first that if 
F(k) exists for each k G Z'^ then Lemma 10.4 implies C'(k) = P(k). 

Consider t^nql. Applying Lemma 10.2 for d = 3, and Lemma 10.3 for 
d = 2, with (10.8), (10.10) and (10.11) gives 


E(Tn,NOL) “'T'n — IAolI ^ Yj 


is Jnol 


sjY 

cRt). 




Then, using (10.3), we can arrange terms to write 


iaolI ^ Y 


IsJnol 


sNj^r. 

sRn 


sli,n — 'I’n T 'y ] 

keZ'* 


Gn(k) . 

sAfi|d?g| 


Gn(k) 


E)i„(k)cj(k) 

is Jnol ^|Avol| 
iov 'iin = -J2kez<i\sRn\ ^(II “ |s-Rn H k + |)o-(k) . Since iio IS convex, 

the boundary condition is valid and it holds that for all i, k G Z'^, 


( 10 . 12 ) 





from Lemma 9.3 and (10.9). Then (10.3), Lemma 10.4 and (10.12) give 
EkeZ'i l<^n(k)| < C, n G Z+; Gn(k) ^ 0 for k G Z'^ and sK'^n = 0(1)- By 
the LDCT, we establish 

X =o{s^n ^), B(f^ nOl) “ + o(l)), 

kGZd 1 


representing the formulation of Theorem 4.2 in terms of If 14(k) exists 
for each k G Z'^, then (10.3) and (10.12) imply that we can use the LDCT 
again to produce 


(10.13) 


^n = 


-1 




( ^(k)f^(k)Vl + o(l)). 


keZ'* 


The proof of Theorem 4.2 for jvjQL is now complete. 

Consider f^oL- repeat the same steps as above to find 



^'n + 


E 

keZ"^ 


g;(k) 

aXri, Rci 


T o(sAfi ), 


G;(k) 


Do,n(k)cr(k) 


The same arguments for Gn apply to G* and (10.13) remains valid when 
each 14(k) exists, kG Z"^, establishing Theorem 4.2 for t^ql- Note as well 
that if 14(k) exists for each k G Z'^, then Lemma 10.4 and Lemma 4.1 also 
imply the second formulation of the bias in Theorem 4.2. □ 


Proof of Theorem 5.1. This follows from Theorems 3.1 and 4.1 and 
simple arguments from calculus involving minimization of a smooth function 
of a real variable. □ 
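The calculus step can be sketched under a simplifying assumption that is not spelled out in this section: suppose the leading terms of the mean square error take the form $f(\lambda) = B^2\lambda^{-2} + K\lambda^d/D$, where $\lambda$ is the block scale, $B$ a bias constant, $K$ a variance constant and $D$ a stand-in for $\det(\Lambda_n)$. All of these names and numerical values are hypothetical. Setting $f'(\lambda) = 0$ gives $\lambda^{d+2} = 2B^2D/(dK)$, and since $f$ is convex on $(0,\infty)$ this is the unique minimizer.

```python
def approx_mse(lam, bias_const, var_const, d, det_sampling):
    """Assumed leading-order MSE: squared bias B^2 / lam^2 plus variance K * lam^d / D."""
    return (bias_const / lam) ** 2 + var_const * lam ** d / det_sampling

def optimal_block_scale(bias_const, var_const, d, det_sampling):
    """Closed-form minimizer of approx_mse: lam^(d+2) = 2 B^2 D / (d K)."""
    return (2.0 * bias_const ** 2 * det_sampling / (d * var_const)) ** (1.0 / (d + 2))

if __name__ == "__main__":
    B, K, d, D = 1.5, 0.8, 2, 10_000.0   # illustrative values only
    lam_opt = optimal_block_scale(B, K, d, D)
    print(lam_opt, approx_mse(lam_opt, B, K, d, D))
    # Nearby block scales should not do better than the closed-form optimum.
    for factor in (0.8, 0.95, 1.05, 1.25):
        print(factor, approx_mse(factor * lam_opt, B, K, d, D))
```

The closed form shows the block scale growing like $D^{1/(d+2)}$ under this assumed model, so larger sampling regions call for larger subsamples, though at a slower rate.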

Proof of Theorem 6.1. For a rectangle T, where nj=i(cj)Cj) C T C 
nj=i[cj,Cj], Cj,Cj G M, define the border Z'^-point set: 13{T} = = 

(si,..., SdY G n T: Sj G {cj,Cj}}. 

It holds that, for k 7^ 0, there exist C > 0, G Z_|_, such that n > Nk, 

(10.14) |A,n(k)| <C||k||^-^A/-2, iGZ'". 

This can be shown easily by considering only volume approximations for 
those Z'^ lattice point counts associated with the interior set Rq [i.e., treating 
Rq as Rq in |L)i ,i(k)|] because the subtracted lattice point counts on the 
borders of Ri n and Ri n n k + Ri n are negligible: 

|^{sAn(i + Ro)}\ ~ |^{sAn(i + Rq) PI k + sAn(i + -Ro)}| 
<C||k||oo.A„'^■^ 

See Nordman (2002) for more details. 

Applying (10.14) in place of Lemma 10.4, the same proof for Theorem 4.2 
establishes Theorem 6.1. □ 

Acknowledgments. The authors thank the referees and an Associate Edi¬ 